METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER 


CROSS-REFERENCES TO RELATED APPLICATIONS 
This application is related to USSN 60/284,770, filed April 18, 2001; USSN 
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN 
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated
herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer;
and to the use of such expression profiles and compositions in diagnosis and therapy of lung 
cancer. The invention further relates to methods for identifying and using agents and/or 
targets that inhibit lung cancer or related conditions. 

BACKGROUND OF THE INVENTION

Lung cancer is the second most commonly occurring cancer in the United States and 
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new 
cases of lung cancer in the United States every year. Of those who are diagnosed with lung 
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women. 

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers 
resulting from smoking. About 400 to 500 separate gaseous substances are present in the 
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids 
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene, 
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines. 

Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by non-smokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher concentrations in
secondhand smoke. For this and other reasons, "passive smoking" is an
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers
each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium, exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic,
genetic factors, and diet.

Histological classification of the various lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs
less frequently than non-small cell lung cancer and generally spreads to distant organs more
rapidly. In general, at the time of presentation, small cell lung cancers have already
spread beyond the bounds where surgery with curative intent
can be undertaken. However, if identified early enough, these cancers are often responsive to
chemotherapy and thoracic radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung 
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma 
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
usually are not identified until significant metastasis has occurred, and metastatic tumors
are typically not very responsive to surgical, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to 
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer. 
Additionally, provided herein are molecular targets and compositions for therapeutic 
intervention in lung disease and other metastatic cancers. 


SUMMARY OF THE INVENTION 
The present invention provides nucleotide sequences of genes that are up- and
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by
the following description of the invention. 

In one aspect, the present invention provides a method of detecting a lung
cancer-associated transcript in a cell from a patient, the method comprising contacting a biological
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the sample may
be contacted with a specific binding reagent, e.g., antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample, or a body fluid. In 
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA. 

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In 
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment, 
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate, 
e.g., a human. 

In one embodiment, the method further comprises the step of amplifying nucleic acids 
before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the
sample with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a 
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the 
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes 
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the 
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody 
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining 
the level of a lung cancer-associated polypeptide in the biological sample by contacting the 
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated 
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a 
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or 
cell membrane. In another embodiment, the polypeptide is recombinant. In one 
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation 
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the 
method comprising the step of administering to the subject a therapeutically effective amount 
of a compound identified as described herein. In one embodiment, the compound is an 
antibody.

In another aspect, the present invention provides a drug screening assay comprising 
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell 
isolated therefrom; (ii) comparing the level of gene expression of a polynucleotide that 
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of 
the polynucleotide is a candidate for the treatment of lung cancer. 

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that 
has not been treated with the test compound. In another embodiment, the control is a normal 
cell or mammal, or a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal 
having lung cancer comprising administering a compound identified by the assay described 
herein. 

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by 
the assay described herein and a physiologically acceptable excipient. 

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel 
methods for diagnosis and treatment of lung disease or cancer, as well as methods for 
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection 
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung 
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma, 
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic 
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See, 
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P, 
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may 
be of lung cancer or a related condition itself, or treatment of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and 
small molecule agonists/antagonists, which will be useful to selectively identify those 
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding 
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for 
molecular characterization of subsets of lung diseases, which subsets may actually require 
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancers, e.g., which affect similar tissues in non-malignant diseases, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may 
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related 
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging, 
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or 
decreases in expression levels. 

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
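
By way of non-limiting illustration, the percentile ratio described above may be
sketched as follows. The function names, the example values, and the use of linear
interpolation between ranks are assumptions made for this sketch only; the percentile
convention actually used in the analyses is not specified herein.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) of a non-empty list.

    Assumption: linear interpolation between the two nearest ranks.
    """
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    rank = pct * (len(ordered) - 1) / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def expression_ratio(disease_values, normal_values, disease_pct=70, normal_pct=90):
    """Ratio of a disease-sample percentile to a normal-sample percentile."""
    denominator = percentile(normal_values, normal_pct)
    if denominator <= 0:
        raise ValueError("normal-sample percentile must be positive")
    return percentile(disease_values, disease_pct) / denominator


# Hypothetical expression intensities for one gene (illustrative values only).
tumor = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
normal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
ratio = expression_ratio(tumor, normal)
```

Taking a high percentile of the normal samples as the denominator makes the ratio
conservative: a gene scores well only if a substantial fraction of disease samples exceeds
nearly all of the normal samples.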


Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)
have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or 
polypeptide sequences. The "full length" may be prior to, or after, various stages of post- 
translational processing or splicing, including alternative splicing. 

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g., 
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of 
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird, reptile, or
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in 
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful. 

The terms "identical" or percent "identity," in the context of two or more nucleic 
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the 
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps 
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or 
nucleotides in length, or more preferably over a region that is 50-100 amino acids or 
nucleotides in length. 

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.
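
By way of non-limiting illustration, a percent identity calculation over a designated
region of two already-aligned sequences may be sketched as follows. The function name,
the use of "-" as the gap character, and the choice of the number of alignment columns
(gaps included) as the denominator are assumptions made for this sketch; other programs
may normalize differently.

```python
def percent_identity(aligned_ref, aligned_test, start=0, end=None):
    """Percent of alignment columns in [start, end) holding identical residues.

    Both inputs must already be aligned to equal length, with '-' marking gaps.
    Assumption: gap columns count toward the denominator but never as matches.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    end = len(aligned_ref) if end is None else end
    columns = list(zip(aligned_ref[start:end], aligned_test[start:end]))
    if not columns:
        raise ValueError("designated region is empty")
    matches = sum(1 for ref, test in columns if ref == test and ref != "-")
    return 100.0 * matches / len(columns)


# One mismatch in ten aligned columns.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
# One gap column in eight aligned columns.
gapped_identity = percent_identity("ACGT-CGT", "ACGTACGT")
```

The start and end arguments play the role of the designated subsequence coordinates,
so the same alignment can be scored over the full length or over a narrower region.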

A "comparison window", as used herein, includes reference to a segment of
contiguous positions selected from the group consisting typically of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150, in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).
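
By way of non-limiting illustration, the scoring step of a Smith and Waterman style
local alignment may be sketched as follows. The function name and the scoring values
(match 2, mismatch -1, linear gap penalty -2) are assumptions chosen for this sketch and
are not the parameters of any particular software package; the sketch returns only the
best local score and omits the traceback that recovers the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core "ACGT" is found despite unrelated flanking residues.
score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```

Only two rows of the scoring matrix are kept in memory at a time, which suffices when
the score alone is needed.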

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
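
By way of non-limiting illustration, the word seeding and extension steps described
above may be sketched for nucleotide sequences as follows. This is a simplified sketch
and not the BLAST software itself: it seeds on exact word matches only (the neighborhood
threshold T is omitted), performs ungapped extension, searches one strand, and computes
no statistics. The function names and the drop-off value X=20 are assumptions made for
the sketch; W=11, M=5, and N=-4 follow the defaults stated above.

```python
def extend(query, subject, q, s, step, score, match, mismatch, x_drop):
    """Extend an ungapped alignment in one direction (step +1 or -1).

    Returns (best_score, best_length), where best_length is the number of
    columns added up to the point at which the best score was reached.
    """
    best, best_len, length = score, 0, 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += match if query[q] == subject[s] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt on a drop of X from the maximum, or a score of zero or below.
        if best - score >= x_drop or score <= 0:
            break
        q += step
        s += step
    return best, best_len


def best_hsp(query, subject, word_size=11, match=5, mismatch=-4, x_drop=20):
    """Return (score, query_start, subject_start, length) of the best HSP, or None."""
    # Index every word of length W in the subject sequence.
    words = {}
    for s in range(len(subject) - word_size + 1):
        words.setdefault(subject[s:s + word_size], []).append(s)

    best = None
    for q in range(len(query) - word_size + 1):
        for s in words.get(query[q:q + word_size], []):
            seed = word_size * match
            right, right_len = extend(query, subject, q + word_size, s + word_size,
                                      1, seed, match, mismatch, x_drop)
            total, left_len = extend(query, subject, q - 1, s - 1,
                                     -1, right, match, mismatch, x_drop)
            hsp = (total, q - left_len, s - left_len,
                   word_size + left_len + right_len)
            if best is None or hsp[0] > best[0]:
                best = hsp
    return best


# A short word size is used here only so that the toy sequences produce a seed.
hsp = best_hsp("GGACGTACGTCC", "TTACGTACGTAA", word_size=4)
```

In this toy example the eight shared residues "ACGTACGT" score 8 x 5 = 40, and the
mismatching flanks are excluded because they only lower the cumulative score.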

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences are substantially identical is that the 
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the 
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a 
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two 
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize 
to each other under stringent conditions. Yet another indication that two nucleic acid 
sequences are substantially identical is that the same primers can be used to amplify the 
sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be 
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or 
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry 
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from 
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments 
means removing at least one contaminant or component from the composition to be purified.
In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to 
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which 
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well 
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical 
structure of an amino acid, but that function similarly to another amino acid. 

Amino acids may be referred to herein by either their commonly known three letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic acid
sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino 
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to 
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can 
be altered to another of the corresponding codons described without altering the encoded 
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression 
product, but not necessarily with respect to actual probe sequences. 
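
By way of non-limiting illustration, a silent variation may be sketched in Python as follows. The codon table is deliberately partial, covering only the alanine, methionine and tryptophan codons named in the text, and the function names are hypothetical.

```python
# Partial codon table (RNA alphabet): only the codons named in the text.
# UGG is the RNA form of the tryptophan codon, written TGG in DNA notation.
CODON_TABLE = {
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",  # alanine
    "AUG": "M",                                      # methionine
    "UGG": "W",                                      # tryptophan
}


def translate(rna: str) -> str:
    """Translate an RNA coding sequence using the partial table above."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # A codon outside the partial table raises KeyError by design.
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def is_silent_variation(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    print(translate("AUGGCAUGG"))
    print(is_silent_variation("AUGGCAUGG", "AUGGCCUGG"))
```

Swapping GCA for GCC changes the nucleic acid but leaves the encoded alanine unchanged, which is the sense in which the variation is silent with respect to the expression product.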

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alter, add or delete a single amino acid or a small percentage of amino acids in the encoded
sequence are "conservatively modified variants" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively 
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typical conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine 
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), 
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)). 
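
By way of non-limiting illustration, the eight substitution groups listed above may be encoded and used to test whether two peptides of equal length differ only by conservative substitutions. The function names are hypothetical, and the sketch handles substitutions only, not insertions or deletions.

```python
# The eight groups recited in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b are identical or share one of the groups."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(p: str, q: str) -> bool:
    """True if equal-length peptides differ only by conservative substitutions."""
    return len(p) == len(q) and all(is_conservative(x, y) for x, y in zip(p, q))


if __name__ == "__main__":
    print(differ_only_conservatively("MKLV", "CRIV"))
    print(differ_only_conservatively("MKLV", "MKLP"))
```

Methionine appears in both group 5 and group 8, so the groups are checked one at a time rather than mapped to a single label per residue.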

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3rd ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are 
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up 
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds, 
although in some cases, nucleic acid analogs are included that may have at least one different 
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach, Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones; non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain 
portions of both double stranded and single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term 
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified 
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
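
By way of non-limiting illustration of the statement above that the depiction of a single strand also defines the complementary strand, the complement of a DNA sequence may be derived in Python as follows. The sketch is restricted to the four standard DNA bases; the function name is hypothetical, and modified bases such as inosine are outside its scope.

```python
# Watson-Crick pairing for the four standard DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))
```

Applying the function twice returns the original sequence, which reflects that either strand fully determines the other.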

A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, physiological, chemical, or other physical 
means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is 
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or 
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a 
radioisotope emitting "hard" e.g., beta radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

WO 02/086443 PCT/US02/12476

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of 
binding to a target nucleic acid of complementary sequence through one or more types of 
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid, 
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant 
cells express genes that are not found within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not 
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the 
host cell rather than in vitro manipulations; however, such nucleic acids, once produced 
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic 
acid as depicted above. 

The term "heterologous" when used with reference to portions of a nucleic acid 
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is 
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes 
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a 
coding region from another source. Similarly, a heterologous protein will often refer to two 
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein). 

A "promoter" is typically an array of nucleic acid control sequences that direct 
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is 
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or 
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed in operable linkage to a promoter. 

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a 
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic 
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and 
will be different in different circumstances. Longer sequences hybridize specifically at 
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen 
(1993) Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic 
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C 
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration)
at which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt concentration is
less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater than 50 
nucleotides). Stringent conditions may also be achieved with the addition of destabilizing 
agents such as formamide. For selective or specific hybridization, a positive signal is 
typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC, 
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency 
amplification, although annealing temperatures may vary between about 32° C and 48° C 
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for 
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for 
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C 
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols. A Guide to Methods and
Applications.
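
By way of non-limiting illustration, two of the numerical criteria recited above may be sketched in Python: the stringent temperature window of about 5-10° C below the Tm, and the requirement that a positive signal be at least two times (preferably 10 times) background. The function names are hypothetical, and the Tm is taken as a measured input; no formula for estimating Tm is assumed.

```python
from typing import Tuple


def stringent_temperature_window(tm_celsius: float) -> Tuple[float, float]:
    """Return (low, high) in degrees C: 10 to 5 degrees below the given Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """True if the signal is at least `fold` times background hybridization."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background


if __name__ == "__main__":
    print(stringent_temperature_window(70.0))
    print(is_positive_signal(250.0, 100.0))
    print(is_positive_signal(250.0, 100.0, fold=10.0))
```

The `fold` parameter defaults to the "at least two times background" criterion and may be set to 10 for the preferred criterion.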

Nucleic acids that do not hybridize to each other under stringent conditions are still 
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under 
moderately stringent hybridization conditions. Exemplary "moderately stringent 
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide 
conditions of similar stringency. Additional guidelines for determining hybridization 
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is 
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a 
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to 
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific 
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and 
protein expression in cells undergoing metastasis, and other characteristics of lung cancer 
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities. 

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung 
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical 
effects. Such functional effects can be measured by many means known to those skilled in 
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer 
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other 
ligands, and measuring cellular proliferation. Determination of the functional effect of a 
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for lung cancer-associated sequences, 
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the 
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory 
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or 
modulators also include genetically modified versions of lung cancer proteins, e.g., versions 
with altered activity, as well as naturally occurring and synthetic ligands, antagonists, 
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be 
identified by incubating lung cancer cells with the test compound and determining increases 
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16. 

Samples or assays comprising lung cancer proteins that are treated with a potential 
activator, inhibitor, or modulator are compared to control samples without the inhibitor, 
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more 
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity 
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher. 
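
By way of non-limiting illustration, the comparison of a treated sample to a control assigned a relative activity of 100% may be sketched in Python as follows. The sketch assumes, as one reading of the text, that about 80% or less indicates inhibition and 110% or more indicates activation; the function name and the "no call" label for intermediate values are hypothetical.

```python
from typing import Tuple


def classify_relative_activity(treated: float, control: float) -> Tuple[float, str]:
    """Return (percent of control, call) for a treated sample's activity."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent = 100.0 * treated / control  # the control is assigned 100%
    if percent <= 80.0:
        return percent, "inhibition"     # about 80% or lower
    if percent >= 110.0:
        return percent, "activation"     # 110% or higher
    return percent, "no call"            # assumption: between the thresholds


if __name__ == "__main__":
    for treated in (40.0, 95.0, 150.0):
        print(classify_relative_activity(treated, 100.0))
```

The preferred tiers recited in the text (e.g., 50% or 25-0% for inhibition, 150% or 200-500% for activation) could be added as further thresholds in the same manner.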

The phrase "changes in cell growth" refers to any change in cell growth and 
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci, 
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells a Manual of
Basic Technique pp. 231-241 (3rd ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor. 
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to 
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney 
(1994) Culture of Animal Cells a Manual of Basic Technique (3rd ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul, 
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized 
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an 
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by 
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes 
antibody fragments either produced by the modification of whole antibodies, or those 
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies. A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric 
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990) 
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783). 

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.



Identification of lung cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).
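
By way of illustration only, the comparison of a sample "fingerprint" against profiles of known states might be sketched as follows. This is a minimal sketch under stated assumptions: the use of Pearson correlation as the similarity measure, the function names `pearson` and `closest_state`, the state labels, and all expression values are hypothetical and are not taken from the specification.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def closest_state(sample: Sequence[float],
                  references: Dict[str, Sequence[float]]) -> str:
    """Return the label of the reference profile most correlated with sample."""
    return max(references, key=lambda state: pearson(sample, references[state]))


# Hypothetical expression levels for the same four genes in each profile.
references = {
    "normal_lung": [1.0, 1.2, 5.0, 4.8],
    "lung_tumor": [6.0, 5.5, 1.0, 1.5],
}
sample = [5.0, 4.0, 2.0, 1.0]
print(closest_state(sample, references))
```

Other similarity measures could be substituted; correlation is used here only because it compares the pattern of expression across genes rather than any single gene.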

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
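
The removal of candidates that are expressed in significant amounts in essential organs can be sketched as a simple filter. This is an illustrative sketch only: the `ESSENTIAL_TISSUES` set, the `max_normal_level` cutoff, the gene names, and the expression values are hypothetical, and the specification does not fix what counts as a "significant amount."

```python
from typing import Dict, List, Set

# Hypothetical set of tissues in which off-target expression is undesirable.
ESSENTIAL_TISSUES: Set[str] = {"heart", "brain", "liver", "kidney"}


def filter_disease_specific(
    candidates: Dict[str, Dict[str, float]],
    max_normal_level: float,
) -> List[str]:
    """Keep genes whose expression is at or below max_normal_level in every
    essential tissue; tissues with no measurement are treated as 0.0."""
    kept = []
    for gene, levels in candidates.items():
        if all(levels.get(t, 0.0) <= max_normal_level for t in ESSENTIAL_TISSUES):
            kept.append(gene)
    return sorted(kept)


# Made-up expression levels of screen hits in normal (non-lung) tissues.
example = {
    "GENE_A": {"heart": 5.0, "brain": 2.0, "liver": 8.0},
    "GENE_B": {"heart": 250.0, "brain": 3.0},
    "GENE_C": {"placenta": 400.0, "kidney": 1.0},
}
print(filter_disease_specific(example, max_normal_level=50.0))
# -> ['GENE_A', 'GENE_C']
```

Here GENE_C is retained despite high placental expression because placenta is not in the assumed essential set, which mirrors the case where expression in a dispensable tissue is tolerated.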

In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. UniGene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in the lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the
ratio is presented as a number greater than one, that the ratio is greater than one, preferably 1.5 or
greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less than
one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
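
The ratio thresholds above can be illustrated with a short sketch. It assumes the ratio is computed as expression in cancerous tissue divided by expression in normal tissue, so that down-regulation appears as a ratio below one; the function name `classify_regulation` and the default cutoffs (2.0 and 0.5, chosen from the preferred ranges stated above) are illustrative assumptions rather than requirements of the specification.

```python
def classify_regulation(
    tumor_level: float,
    normal_level: float,
    up_cutoff: float = 2.0,    # assumed default; the text also allows 1.5
    down_cutoff: float = 0.5,  # assumed default; the text also allows 0.25
) -> str:
    """Classify a gene from its tumor/normal expression ratio."""
    if tumor_level < 0 or normal_level <= 0:
        raise ValueError("expression levels must be non-negative, normal > 0")
    ratio = tumor_level / normal_level
    if ratio >= up_cutoff:
        return "up-regulated"
    if ratio <= down_cutoff:
        return "down-regulated"
    return "unchanged"


print(classify_regulation(30.0, 10.0))  # ratio 3.0  -> up-regulated
print(classify_regulation(10.0, 40.0))  # ratio 0.25 -> down-regulated
print(classify_regulation(12.0, 10.0))  # ratio 1.2  -> unchanged
```

The same down-regulated gene could equally be reported with the inverse ratio (normal over tumor, here 4.0), which is why the definition above gives thresholds for both presentations.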

Informatics 

The ability to identify genes that are over- or under-expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
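
One possible way to hold such cross-tabulated assay records is sketched below using an in-memory SQLite database. This is an illustration under stated assumptions, not the database of the invention: the table name `assay_record`, the column names, the identifiers, and the quantities are all hypothetical.

```python
import sqlite3
from typing import Dict

# In-memory database with one row per (target, sample source) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assay_record (
        target_id     TEXT NOT NULL,  -- (1) unique identification code
        sample_source TEXT NOT NULL,  -- (2) sample source
        quantity      REAL NOT NULL,  -- (3) absolute or relative quantity
        PRIMARY KEY (target_id, sample_source)
    )
    """
)

# Made-up records for two targets measured in two sample sources.
rows = [
    ("T001", "normal_lung", 1.0),
    ("T001", "tumor", 4.2),
    ("T002", "normal_lung", 3.0),
    ("T002", "tumor", 0.6),
]
conn.executemany("INSERT INTO assay_record VALUES (?, ?, ?)", rows)


def quantities_by_source(db: sqlite3.Connection, target_id: str) -> Dict[str, float]:
    """Return {sample_source: quantity} for one target."""
    cur = db.execute(
        "SELECT sample_source, quantity FROM assay_record "
        "WHERE target_id = ? ORDER BY sample_source",
        (target_id,),
    )
    return dict(cur.fetchall())


print(quantities_by_source(conn, "T001"))
# -> {'normal_lung': 1.0, 'tumor': 4.2}
```

Keying each row on both the target identifier and the sample source is what makes the records "cross-tabulated": the same target can be looked up across control and pathological specimens.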

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
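
The rank-ordering of stored records against a query by a computed similarity value can be illustrated with a small global-alignment scorer. This is a sketch only: it uses a plain Needleman-Wunsch score with a linear gap penalty, which is not the algorithm of FASTA, GAP, BESTFIT, or any package named above, and the scoring parameters, record names, and sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]  # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_targets(query: str, database: Dict[str, str]) -> List[Tuple[str, int]]:
    """Score every stored sequence against the query, best score first."""
    scored = [(name, global_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up peptide records standing in for stored assay data.
database = {"rec1": "MKTAYIAK", "rec2": "MKTAYAK", "rec3": "GGGGGGGG"}
print(rank_targets("MKTAYIAK", database))
# -> [('rec1', 8), ('rec2', 5), ('rec3', -8)]
```

The gap penalty plays the role of the "gap weight" mentioned above: rec2 differs from the query only by one missing residue, so it scores seven matches minus one gap.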

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or dysregulated cellular processes (see, e.g., Alberts (ed. 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine-phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into the
enzymatic potential of the molecule and/or the molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res.
26:320-322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2
domain-containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G
protein-coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
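
As a rough illustration of how a stretch of about 17 consecutive hydrophobic residues might be located in a sequence, consider the sketch below. It is deliberately naive and is not the method used by PSORT or any other predictor: the residue set in `HYDROPHOBIC`, the function name, and the example sequence are assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Assumed set of residues treated as hydrophobic (one-letter codes).
HYDROPHOBIC = "AILMFVWCG"


def find_hydrophobic_runs(seq: str, min_len: int = 17) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of every maximal run of
    at least min_len consecutive hydrophobic residues."""
    pattern = re.compile("[%s]{%d,}" % (HYDROPHOBIC, min_len))
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


# Made-up sequence: a 17-residue hydrophobic segment followed by charged residues.
tm_segment = "LLVIAALAGLLVFAVIL"
sequence = "MKRSD" + tm_segment + "KRKEQST"
for start, end in find_hydrophobic_runs(sequence):
    print(start, end, sequence[start:end])
```

The number of runs returned gives a crude estimate of the number of candidate membrane-spanning regions; practical tools generally score hydropathy over a sliding window rather than requiring an unbroken run.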

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful markers of lung disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or an exocrine manner (secretion, e.g., through a
duct or to an adjacent epithelial surface, as in sweat glands, sebaceous glands, pancreatic ducts,
lacrimal glands, mammary glands, wax producing glands of the ear, etc.). Thus secreted molecules
often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics, and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., by systems
such as UniGene (see http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins.

The lung cancer nucleic acids of the present invention are used in several ways. In a 
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached 
to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications. 
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer 
proteins can be put into expression vectors for the expression of lung cancer proteins, again 
for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the
nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under appropriate reaction conditions, particularly high stringency
conditions, as outlined herein.
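
By way of illustration only, the notion of imperfect complementarity described above can be sketched by counting the positions at which a probe fails to pair with its target. The function names below are hypothetical, and a mismatch count is only a crude proxy: whether hybridization actually occurs depends on the reaction conditions and stringency, which this sketch does not model.

```python
# Illustrative sketch only: counts base-pair mismatches between a probe and
# a target of equal length. Names are hypothetical; real hybridization
# behavior depends on stringency conditions that are not modeled here.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(probe: str, target: str) -> int:
    """Count positions where the probe does not pair with the target.

    Both sequences are written 5' to 3'. A perfectly complementary probe
    equals the reverse complement of the target and yields 0.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    perfect = reverse_complement(target)
    return sum(1 for p, q in zip(probe.upper(), perfect) if p != q)


if __name__ == "__main__":
    print(count_mismatches("GTTT", "AAAC"))  # perfectly complementary
    print(count_mismatches("GTTA", "AAAC"))  # one mismatched position
```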

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of
lengths up to hundreds of bases can be used.

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
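
As a minimal sketch of how several probes per target might be laid out, the following generates probes complementary to successive windows of a target sequence. The function name, the default probe length of 30 bases, and the step size are assumptions chosen for illustration; a step smaller than the probe length gives overlapping probes, and a step equal to or larger than it gives separate probes.

```python
# Illustrative sketch only: tiles a target sequence with complementary
# probes. The name tile_probes and its defaults are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target: str, probe_len: int = 30, step: int = 15):
    """Return (start, probe) pairs covering the target.

    Each probe is the reverse complement of a window of `probe_len` bases
    beginning at 0-based offset `start`. Windows advance by `step` bases.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    if probe_len > len(target):
        raise ValueError("probe_len exceeds target length")
    probes = []
    for start in range(0, len(target) - probe_len + 1, step):
        window = target[start:start + probe_len]
        probes.append((start, reverse_complement(window)))
    return probes


if __name__ == "__main__":
    # Three overlapping 4-base probes over a 10-base toy target.
    for start, probe in tile_probes("AAACCCGGGT", probe_len=4, step=3):
        print(start, probe)
```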

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to a biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or 
other grammatical equivalents herein is meant a material that can be modified for the 
attachment or association of the nucleic acid probes and is amenable to at least one detection
method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large, and includes, but is not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S.
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in
its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art, 
other configurations of substrates may be used as well. For example, the probes may be 
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed 
cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized, and then attached to the surface
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via linkage to an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in
the art. For example, photoactivation techniques utilizing photopolymerization compounds
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in
situ, using known photolithographic techniques, such as those described in WO 95/25116;
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of
which are expressly incorporated by reference; these methods of attachment form the basis of
the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
lung cancer-associated sequences. These assays are typically performed in conjunction with
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of lung cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).
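
The comparison to controls used in quantitative amplification can be illustrated with a short calculation. The sketch below uses the comparative threshold-cycle (2^-ΔΔCt) approach, which is a commonly used convention and is not a method specified in this document. The function name, parameter names, and the assumption of perfect doubling per cycle (efficiency of 2.0) are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: relative expression by the comparative
# threshold-cycle (delta-delta Ct) convention. This is an assumed,
# commonly used calculation, not a method taken from this document.


def relative_expression(ct_target_sample: float,
                        ct_reference_sample: float,
                        ct_target_control: float,
                        ct_reference_control: float,
                        efficiency: float = 2.0) -> float:
    """Return fold change of the target in the sample versus the control.

    Each Ct is the cycle at which fluorescence crosses a fixed threshold.
    The reference is a transcript assumed to be equally expressed in both
    the sample and the control. `efficiency` is the assumed amplification
    factor per cycle (2.0 means perfect doubling).
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Target crosses threshold 3 cycles earlier (after normalization) in
    # the sample than in the control, i.e. roughly 2**3 more template.
    print(relative_expression(22.0, 18.0, 25.0, 18.0))
```

A lower threshold cycle indicates more starting template, so a negative ΔΔCt corresponds to a fold change greater than one under these assumptions.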

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988)
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Nat. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc.

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer
proteins, are used to make a variety of expression vectors to express lung cancer proteins
which can then be used in screening assays, as described below. Expression vectors and
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel,
supra, and Fernandez and Hoeffler (eds., 1999) Gene Expression Systems) and are used to
express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic
acid encoding the lung cancer protein. The term "control sequences" refers to DNA
sequences used for the expression of an operably linked coding sequence in a particular host
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter,
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to
utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include, but are
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop
sequences, translational start and stop sequences, and enhancer or activator sequences. In a 
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences may be either constitutive or inducible promoters. The promoters 
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the 
present invention. 

In addition, an expression vector may comprise additional elements. For example, the 
expression vector may have two replication systems, thus allowing it to be maintained in two 
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector often contains at least one sequence homologous to the host cell genome, and
preferably two homologous sequences which flank the expression construct. The integrating
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well
known in the art (e.g., Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable 
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used. 

The lung cancer proteins of the present invention are usually produced by culturing a 
host cell transformed with an expression vector containing nucleic acid encoding a lung 
cancer protein, under the appropriate conditions to induce or cause expression of the lung
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and 
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a 
macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the lung cancer proteins are expressed in mammalian 
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. Examples of transcription terminator and polyadenylation signals include those
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems. 
Promoters from bacteriophage may also be used and are known in the art. In addition, 
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of 
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an
efficient ribosome binding site is desirable. The expression vector may also include a signal
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The
protein is either secreted into the growth media (gram-positive bacteria) or into the
periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow
for the selection of bacterial strains that have been transformed. Suitable selection genes
include genes which render the bacteria resistant to drugs such as ampicillin,
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine
biosynthetic pathways. These components are assembled into expression vectors. Expression
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E.
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells
using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others.

In one embodiment, lung cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression
for affinity purification purposes, or for other reasons. For example, when the lung cancer
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other
nucleic acid for expression purposes.

In a preferred embodiment, the lung cancer protein is purified or isolated after
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate
ways. Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein
may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification.
The degree of purification necessary will vary depending on the use of the lung cancer 
protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids
are useful in a number of applications. They may be used as immunoselection reagents, as
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as 
transcription or translation inhibitors, etc. 

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer 
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the 
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion 
or insertion, with amino acid substitutions being particularly preferred. The amino acid 
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide. 

Also included within one embodiment of lung cancer proteins of the present invention 
are amino acid sequence variants. These variants typically fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant lung cancer protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the
predetermined nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants
typically exhibit a similar qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below. 

While the site or region for introducing an amino acid sequence variation is often 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung 
cancer protein activities. 

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although 
in some cases deletions may be much larger. 
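
The three classes of variant described above can be illustrated, by way of example only, as simple operations on a one-letter amino acid string. The function names and the 1-based position convention below are assumptions made for this sketch, and the toy peptide is not a sequence from this document.

```python
# Illustrative sketch only: substitutional, insertional and deletional
# variants of a peptide given in one-letter code. Function names and the
# 1-based position convention are hypothetical.


def substitute(seq: str, position: int, residue: str) -> str:
    """Replace the single residue at a 1-based position."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    if len(residue) != 1:
        raise ValueError("a substitution replaces exactly one residue")
    return seq[:position - 1] + residue + seq[position:]


def insert_residues(seq: str, after_position: int, residues: str) -> str:
    """Insert residues after a 1-based position (0 inserts at the start)."""
    if not 0 <= after_position <= len(seq):
        raise ValueError("position out of range")
    return seq[:after_position] + residues + seq[after_position:]


def delete_residues(seq: str, position: int, count: int = 1) -> str:
    """Delete `count` residues starting at a 1-based position."""
    if count < 1 or position < 1 or position + count - 1 > len(seq):
        raise ValueError("deletion out of range")
    return seq[:position - 1] + seq[position - 1 + count:]


if __name__ == "__main__":
    wild_type = "MKTAY"  # toy peptide
    print(substitute(wild_type, 3, "S"))
    print(insert_residues(wild_type, 2, "GG"))
    print(delete_residues(wild_type, 2, 2))
```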

Substitutions, deletions, insertions or any combination thereof may be used to arrive
at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When 
small alterations in the characteristics of a lung cancer protein are desired, substitutions are 
generally made in accordance with the amino acid substitution chart provided in the 
definition section. 

Variants typically exhibit essentially the same qualitative biological activity and will
elicit the same immune response as a naturally-occurring analog, although variants also are
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the
variant may be designed or reorganized such that a biological activity of the lung cancer
protein is altered. For example, glycosylation sites may be added, altered, or removed. 

Covalent modifications of lung cancer polypeptides are included within the scope of 
this invention. One type of covalent modification includes reacting targeted amino acid
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully
described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters with 4-azidosalicylic
acid, homobifunctional imidoesters, including disuccinimidyl esters such as
3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues,
methylation of the y-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by 
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the 
native glycosylation pattern" is intended herein to mean adding to or deleting one or more 
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns
can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns. 

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished 
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the 
addition of, or substitution by, one or more serine or threonine residues to the native sequence
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that
codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the lung cancer
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981)
CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987)
Arch. Biochem. Biophys. 259:52 and by Edge, et al. (1981) Anal. Biochem. 118:131.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350.
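
Both the addition and the removal of glycosylation targets described above may proceed through codon changes at the DNA level. As a minimal sketch, the following replaces one codon in a coding sequence and translates the result with the standard genetic code. The function names, the 0-based codon index, and the toy coding sequence are assumptions for illustration only.

```python
# Illustrative sketch only: changes a preselected codon so that it encodes
# a desired amino acid (here, serine as a potential O-linked site).
# Names and the toy sequence are hypothetical.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mutate_codon(dna: str, codon_index: int, new_codon: str) -> str:
    """Replace the codon at a 0-based codon index with `new_codon`."""
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be a valid DNA triplet")
    start = codon_index * 3
    if codon_index < 0 or start + 3 > len(dna):
        raise ValueError("codon_index out of range")
    return dna[:start] + new_codon + dna[start + 3:]


if __name__ == "__main__":
    coding = "ATGGCTAAA"                       # toy sequence: Met-Ala-Lys
    variant = mutate_codon(coding, 1, "TCT")   # Ala codon changed to Ser codon
    print(translate(coding), translate(variant))
```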

Another type of covalent modification of lung cancer polypeptides comprises linking the lung
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337. 

Lung cancer polypeptides of the present invention may also be modified in a way to 
form chimeric molecules comprising a lung cancer polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment, 
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an 
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the 
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known, and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal
chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol.
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein 
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al. 
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem. 266:15163-
15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) Proc. Nat'l
Acad. Sci. USA 87:6393-6397). 

Also included are other lung cancer proteins of the lung cancer family, and lung 
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be 
appreciated by those in the art, particularly useful probe and/or PCR primer sequences 
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the 
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being more preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra). 
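
As a rough illustration of the primer length guidance above, the following Python sketch screens a candidate primer sequence. It is not part of the disclosed methods: the function name `check_primer`, the use of the letter `I` to denote inosine, and the returned labels are assumptions made only for this example.

```python
# Illustrative sketch only; names and labels are hypothetical.
ALLOWED_BASES = set("ACGTI")  # 'I' denotes inosine (an assumption of this sketch)


def check_primer(seq: str) -> str:
    """Classify a primer by the length ranges stated in the text."""
    seq = seq.upper()
    if not seq or set(seq) - ALLOWED_BASES:
        raise ValueError("primer may contain only A, C, G, T, or I")
    n = len(seq)
    if 20 <= n <= 30:
        return "within about 20 to about 30 nucleotides"
    if 15 <= n <= 35:
        return "within about 15 to about 35 nucleotides"
    return "outside the stated ranges"
```

For example, a 20-mer falls in the narrower range, while a 15-mer containing inosine falls only in the broader one.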

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or 
"determinant" herein is typically meant a portion of a protein which will generate and/or bind 
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies 
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly to linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity. 

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and 
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or 
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a 
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful 
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g., 
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic 
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in 
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies 
may be prepared using hybridoma methods, such as those described by Kohler and Milstein 
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host 
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will 
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof, 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are 
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells. 
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl 
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include 
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are 
typically monoclonal, preferably human or humanized, antibodies that have binding 
specificities for at least two different antigens or that have binding specificities for two 
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof, and the other is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or 
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies 
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly 
preferred and about a 95-100% decrease being especially preferred. 

In a preferred embodiment, the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', 
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal 
sequence derived from non-human immunoglobulin. Humanized antibodies include human 
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and 
capacity. In some instances, Fv framework residues of a human immunoglobulin are 
replaced by corresponding non-human residues. Humanized antibodies may also comprise 
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR 
regions correspond to those of a non-human immunoglobulin and all or substantially all of 
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A 
humanized antibody optimally also will comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been 
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous 
immunoglobulin genes have been partially or completely inactivated. Upon challenge, 
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992) 
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994) 
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93. 

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an 
antigen to which antibodies are raised. The antigen may be provided by injecting a 
polypeptide against which antibodies are desired to be raised into a recipient, or contacting 
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment, the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein. 

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from 
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding 
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of 
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cellular cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety. 
The effector moiety can be various molecules, including labeling moieties such as radioactive 
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic 
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or 
signaling activity such as protease or collagenase activity associated with lung cancer. 

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not 
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety. 

In another preferred embodiment, the lung cancer protein against which the antibodies 
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the lung cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto may contain a signal for that target localization, e.g.,
a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 µM or
better, and most preferably, 0.01 µM or better. Selectivity of binding to the specific target
and not to other related sequences is also important.
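
The affinity tiers in the preceding paragraph can be expressed as a simple lookup. The sketch below is illustrative only. It assumes that the quoted concentrations are dissociation constants, that a smaller value means tighter binding, and that "at least about" is read as "no weaker than"; the function name `binding_tier` and the tier labels are hypothetical.

```python
# Illustrative sketch only; thresholds are taken from the text, labels are hypothetical.
TIERS = [
    (1e-8, "most preferred (0.01 uM or better)"),
    (1e-7, "preferred (0.1 uM or better)"),
    (1e-6, "more usual (about 1 uM)"),
    (1e-4, "specific binding (about 0.1 mM)"),
]


def binding_tier(kd_molar: float) -> str:
    """Return the tightest tier whose threshold the dissociation constant meets."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    for threshold, label in TIERS:
        if kd_molar <= threshold:
            return label
    return "weaker than the stated thresholds"
```

Note that this captures affinity only; the selectivity requirement mentioned in the text would need a separate comparison against related sequences.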


Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different 
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g., 
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state 
or point of development is essentially a "fingerprint" of the state of the cell. While two states 
may have a particular gene similarly expressed, the evaluation of a number of genes 
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to 
determine whether a tissue sample has the gene expression profile of normal or cancerous 
tissue. This will provide for molecular diagnosis of related conditions. 
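
One way to picture the comparison of a sample "fingerprint" against reference profiles is sketched below in Python. The text does not prescribe any particular algorithm, so the use of Pearson correlation, the function names `pearson` and `closest_state`, and the numbers in the usage note are all assumptions chosen for illustration.

```python
# Illustrative sketch only; the similarity measure and all names are assumptions.
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sx * sy)


def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample."""
    return max(references, key=lambda name: pearson(sample, references[name]))
```

With made-up reference profiles such as `{"normal": [1, 1, 5, 5], "cancer": [5, 5, 1, 1]}`, a sample measured over the same four genes would be assigned to whichever reference it tracks more closely. A real diagnostic would require validated reference data and statistics well beyond this sketch.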

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in 
an increased amount of transcript, or downregulated, resulting in a decreased amount of 
transcript. The degree to which expression differs need only be large enough to quantify via 
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, the change in expression (i.e., upregulation or downregulation) is preferably
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
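
The percentage thresholds above can be made concrete with a short calculation. The Python sketch below is illustrative: the text does not say how a percent change is computed for downregulation, so the sketch assumes the change is measured relative to the lower of the two levels, which treats a 4-fold increase and a 4-fold decrease alike. The names `percent_change` and `highest_threshold_met` are hypothetical.

```python
# Illustrative sketch only; the symmetric convention and all names are assumptions.
THRESHOLDS = [300.0, 200.0, 150.0, 100.0, 50.0]  # percent changes named in the text


def percent_change(level_a: float, level_b: float) -> float:
    """Percent difference between two expression levels, relative to the lower one."""
    low, high = sorted((level_a, level_b))
    if low <= 0:
        raise ValueError("expression levels must be positive")
    return (high / low - 1.0) * 100.0


def highest_threshold_met(pct: float):
    """Return the largest stated threshold that pct meets, or None if below 50%."""
    for threshold in THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None
```

Under this convention, levels of 10 and 40 arbitrary units differ by 300% regardless of which tissue has the higher level.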

Evaluation may be at the gene transcript or the protein level. The amount of gene 
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the 
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy 
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g., 
those identified as being important in a lung cancer or disease phenotype, can be evaluated in 
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for 
the detection and quantification of lung cancer sequences in a particular cell. The assays are 
further described below in the example. PCR techniques can be used to provide greater 
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are 
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of 
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxigenin-labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells 
containing lung cancer sequences are used in diagnostic assays. This can be performed on an 
individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides. 

As described and defined herein, lung cancer proteins, including intracellular, 
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic 
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer 
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel 
(typically a denaturing and reducing protein gel, but may be another type of gel, including 
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the lung cancer protein find use in in situ 
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary 
antibodies contains a distinct and detectable label. This method finds particular use in 
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability 
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence 
activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques 
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE 
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid 
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer 
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then 
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical, 
pathological, or other information, in terms of long term prognosis. Again, this may be done 
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification. 


Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described 
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids, 
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for 
compositions which modulate the lung cancer phenotype or an identified physiological 
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a 
candidate agent, see Zlokarnik, supra. 

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test 
compounds can be screened for the ability to modulate gene expression or for binding to the 
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene 
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal tissue versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
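
The arithmetic in this paragraph, where the desired modulation mirrors the observed change, can be sketched as follows. The Python function `target_modulation` and its return format are hypothetical and are offered only to make the 4-fold and 10-fold examples concrete.

```python
# Illustrative sketch only; the function name and return format are hypothetical.
def target_modulation(cancer_level: float, normal_level: float):
    """Return (direction, fold) that would bring the cancer level back to normal."""
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    if cancer_level > normal_level:
        return ("decrease", cancer_level / normal_level)
    if cancer_level < normal_level:
        return ("increase", normal_level / cancer_level)
    return ("none", 1.0)
```

A gene measured at 40 units in lung cancer tissue and 10 units in normal tissue yields a 4-fold decrease as the target; a gene at 1 unit versus 10 units yields a 10-fold increase.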

The amount of gene expression may be monitored using nucleic acid probes and the 
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of 
expression. 

In a preferred embodiment, the expression of a number of genes or proteins,
i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as 
outlined herein for the detection and quantification of lung cancer sequences in a particular 
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.
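
The plate-based PCR format described above amounts to a mapping from wells to primer sets. The Python sketch below shows one way such a layout could be recorded. The 8 x 12 (96-well) default, the row-major fill order, and the function name `assign_wells` are assumptions for illustration; the text does not specify a plate format.

```python
# Illustrative sketch only; plate geometry and naming are assumptions.
from string import ascii_uppercase


def assign_wells(primer_sets, rows: int = 8, cols: int = 12) -> dict:
    """Map well labels (A1, A2, ...) to primer sets, filling row by row."""
    if not 1 <= rows <= len(ascii_uppercase) or cols < 1:
        raise ValueError("unsupported plate dimensions")
    wells = [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on one plate")
    return dict(zip(wells, primer_sets))
```

Each entry of the returned dictionary identifies which primer set was dispensed into which well, so that the result of each reaction can be traced back to a gene.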

Expression monitoring can be performed to identify compounds that modify the 
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence 
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the 
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical 
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic 
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression 
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator 
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In 
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection. 
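
A concentration series with a zero-concentration negative control, as described above, might be set up and normalized as in the following Python sketch. The serial-dilution scheme, the function names `dilution_series` and `relative_response`, and the numbers in the usage note are assumptions for illustration only.

```python
# Illustrative sketch only; the dilution scheme and all names are assumptions.
def dilution_series(top: float, factor: float, steps: int) -> list:
    """Return ascending concentrations, starting with a 0.0 negative control."""
    if top <= 0 or factor <= 1 or steps < 1:
        raise ValueError("need top > 0, factor > 1, and steps >= 1")
    doses = [top / factor**i for i in range(steps)]
    return [0.0] + sorted(doses)


def relative_response(readings: dict) -> dict:
    """Express each signal as a fraction of the zero-concentration control signal."""
    control = readings[0.0]
    if control == 0:
        raise ValueError("control signal must be non-zero")
    return {dose: signal / control for dose, signal in readings.items()}
```

For instance, `dilution_series(100.0, 10.0, 3)` gives the control plus three ten-fold dilutions, and readings keyed by those concentrations can then be normalized to the control to expose the differential response.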

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is 
inhibited or blocked. 

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a 
library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds 
generated by either chemical synthesis or biological synthesis by combining a number of 
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e.,
the number of amino acids in a polypeptide compound). Millions of chemical compounds
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop,
et al. (1994) J. Med. Chem. 37(9):1233-1251).
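The size of such a linear library follows directly from the number of building blocks and the compound length. The following Python sketch is illustrative only; the helper names are hypothetical, and the building-block set is assumed to be the 20 standard amino acids in one-letter code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed set: 20 standard amino acids


def library_size(length, n_building_blocks=len(AMINO_ACIDS)):
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length


def enumerate_library(length, building_blocks=AMINO_ACIDS):
    """Lazily yield every possible sequence of the given length."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)
```

Under these assumptions a pentapeptide library already contains 20^5 = 3,200,000 members, which is consistent with the "millions of chemical compounds" noted above. The generator is lazy so that large libraries need not be held in memory.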

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res. 37:487-493,
Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see, 
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase 
chemistries. These systems include automated workstations like the automated synthesis 
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J.; Asinex, Moscow, RU; Tripos,
Inc., St. Louis, MO; ChemStar, Ltd, Moscow, RU; 3D Pharmaceuticals, Exton, PA; Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide 
expression, and polypeptide activity. 

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene 
transcription, ligand binding, and the like. 

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way 
libraries of proteins may be made for screening in the methods of the invention. Particularly 
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins, 
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30 
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to 
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random 
sequences of nucleotides and amino acids, respectively. Since these random peptides (or 
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a 
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some 
positions within the sequence are either held constant, or are selected from a limited number 
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
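The distinction between a fully randomized library and a biased library can be sketched as follows. This is an illustrative Python example only: the template representation, the function names, and the particular residues grouped as "hydrophobic" are assumptions of the example rather than definitions from the specification.

```python
import random

ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"  # assumed: 20 standard amino acids
HYDROPHOBIC = "AVLIMFW"                # assumed class membership, for illustration


def random_peptide(length, rng):
    """Fully randomized peptide: any residue may occur at any position."""
    return "".join(rng.choice(ALL_RESIDUES) for _ in range(length))


def biased_peptide(template, rng):
    """Biased peptide: each position is drawn only from its allowed residues.

    template is a list of strings; a one-character string holds that
    position constant, a longer string restricts it to a defined class.
    """
    return "".join(rng.choice(allowed) for allowed in template)


# Hypothetical template: constant cysteines at both ends (e.g., for
# cross-linking), a hydrophobic residue in the middle, free positions between.
EXAMPLE_TEMPLATE = ["C", ALL_RESIDUES, HYDROPHOBIC, ALL_RESIDUES, "C"]
```

Calling `biased_peptide(EXAMPLE_TEMPLATE, random.Random(0))` repeatedly yields library members that all share the constant positions while varying elsewhere.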

Modulators of lung cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins. 

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with Cy3 or Cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a 
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the 
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic 
solvent concentration, etc. 
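To illustrate how these variables interact, the Python sketch below uses a commonly cited empirical approximation for the melting temperature of long DNA:DNA hybrids. The formula and its coefficients are not taken from this specification, vary somewhat between sources, and are shown only as an assumption-laden example; the function names are hypothetical.

```python
import math


def estimate_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate hybrid melting temperature in degrees C.

    Assumed empirical form (illustrative only):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def stringency_margin(hyb_temp_c, na_molar, gc_percent, formamide_percent, length_bp):
    """Degrees below the estimated Tm; a smaller margin means higher stringency."""
    return estimate_tm(na_molar, gc_percent, formamide_percent, length_bp) - hyb_temp_c
```

Under this approximation, raising the temperature, adding formamide, or lowering the salt concentration each narrows the margin and therefore raises stringency.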

These parameters may also be used to control non-specific binding, as is generally 
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components 
of the reaction may be added simultaneously, or sequentially, in different orders, with 
preferred embodiments outlined below. In addition, the reaction may include a variety of 
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific
or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
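One simple way to express changes in expression levels between two states is a per-gene log2 ratio. The Python sketch below is illustrative only; the dictionary representation of a profile, the pseudocount, and the two-fold default threshold are assumptions of the example and are not taken from the specification.

```python
import math


def expression_changes(state_a, state_b, pseudocount=1.0):
    """Per-gene log2 ratio of state_b to state_a for genes measured in both."""
    shared = state_a.keys() & state_b.keys()
    return {
        gene: math.log2((state_b[gene] + pseudocount) / (state_a[gene] + pseudocount))
        for gene in shared
    }


def differentially_expressed(changes, min_abs_log2=1.0):
    """Genes whose absolute log2 change meets the (assumed) threshold."""
    return sorted(g for g, v in changes.items() if abs(v) >= min_abs_log2)
```

The pseudocount keeps the ratio defined when a gene is undetected in one state; the resulting mapping of gene to change is one possible concrete form of a gene expression profile.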

Screens are performed to identify modulators of the lung cancer phenotype. In one 
embodiment, screening is performed to identify modulators that can induce or suppress a 
particular expression profile, thus preferably generating the associated phenotype. In another 
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent. 
After identifying a modulator based upon its ability to suppress a lung cancer expression 
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In 
particular these sequences and the proteins they encode find use in marking or identifying 
agent treated cells. In addition, antibodies can be raised against the agent induced proteins 
and used to target novel therapeutics to the treated lung cancer tissue sample. 

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if 
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein. 

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that 
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be 
known and need not be represented in the original expression screening platform, nor does 
the level of transcript for the target protein need to change. 
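A signature-based screen of this kind can be scored without knowing the drug target. As a minimal sketch, assuming each profile is a mapping of gene name to expression level measured on the same scale, the hypothetical Python helper below lists the signature genes that an agent moves closer to their normal-tissue level; the scoring rule itself is an assumption of this example.

```python
def genes_shifted_toward_normal(cancer, treated, normal, min_shift=0.0):
    """Signature genes whose treated level is closer to normal than before treatment."""
    shifted = []
    for gene in cancer:
        if gene not in treated or gene not in normal:
            continue  # score only genes measured in all three profiles
        distance_before = abs(cancer[gene] - normal[gene])
        distance_after = abs(treated[gene] - normal[gene])
        if distance_before - distance_after > min_shift:
            shifted.append(gene)
    return sorted(shifted)
```

An agent that shifts one gene would register as having an effect; an agent that shifts many genes of the signature would be a stronger candidate for suppressing the phenotype.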






WO 02/086443 PCT/US02/12476 

Measurement of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype, can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro. 
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids, 
radioactively or enzymatically labeled antibodies, and the like, as described herein. 

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity 
is measured according to standard techniques known to those of skill in the art. 

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is 
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate 
differentially expressed activity. Moreover, once initial candidate compounds are identified, 
variants can be further screened to better evaluate structure activity relationships. 

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively, 
cells comprising the lung cancer proteins can be used in the assays. 

Thus, in a preferred embodiment, the methods comprise combining a lung cancer 
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other 
mammalian proteins may also be used, e.g., for the development of animal models of human 
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins 
may be used. 

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is readily
separated from soluble material, and that is otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of a convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test 
compound is added to the assay. Alternatively, the candidate agent is bound to the support 
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein 
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding, 
functional assays (phosphorylation assays, etc.) and the like. 

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing 
off excess reagent, and determining whether the label is present on the solid support. Various 
blocking and washing steps may be utilized as appropriate. 

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful. 

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled 
component is followed, to indicate binding. 

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and 
washing, followed by the competitor. The absence of binding by the competitor may indicate 
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the 
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein. 

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
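The two-sample comparison can be reduced to a single figure, for instance the percent change in competitor signal. The following Python sketch is illustrative only; the function names and the 50% cutoff are hypothetical and are not values taken from the specification.

```python
def competitor_displacement(bound_without_test, bound_with_test):
    """Percent reduction in competitor binding in the second sample relative to the first."""
    if bound_without_test <= 0:
        raise ValueError("first-sample competitor binding must be positive")
    return 100.0 * (bound_without_test - bound_with_test) / bound_without_test


def is_candidate_binder(bound_without_test, bound_with_test, min_percent=50.0):
    """Flag a test compound when competitor binding differs by at least min_percent."""
    change = competitor_displacement(bound_without_test, bound_with_test)
    return abs(change) >= min_percent  # a change in either direction is a difference
```

The absolute value is used because the text refers to any change or difference in competitor binding, not only a decrease.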

Alternatively, differential screening is used to identify drug candidates that bind to the 
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of 
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents 
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein. 

Positive controls and negative controls may be used in the assays. Preferably control 
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
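As an illustration of how replicate counts from control and test samples might be compared, the Python sketch below computes a Welch t statistic from at least triplicate readings using only the standard library. The choice of this statistic and the function names are assumptions of the example; the specification does not name a particular statistical test.

```python
import math
from statistics import mean, stdev


def summarize(replicates):
    """Mean and sample standard deviation of at least three replicate counts."""
    if len(replicates) < 3:
        raise ValueError("at least triplicate measurements are expected")
    return mean(replicates), stdev(replicates)


def welch_t(test_counts, control_counts):
    """Welch t statistic for test versus control replicate counts."""
    mean_t, sd_t = summarize(test_counts)
    mean_c, sd_c = summarize(control_counts)
    std_err = math.sqrt(sd_t ** 2 / len(test_counts) + sd_c ** 2 / len(control_counts))
    if std_err == 0:
        raise ValueError("replicates show no variance; statistic is undefined")
    return (mean_t - mean_c) / std_err
```

The resulting statistic would then be compared against a t distribution to judge significance; that step is omitted here to keep the example free of third-party dependencies.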

A variety of other reagents may be included in the screening assays. These include 
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to 
facilitate optimal protein-protein binding and/or reduce non-specific or background 
interactions. Also, reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding. 

In a preferred embodiment, the invention provides methods for screening for a 
compound capable of modulating the activity of a lung cancer protein. The methods 
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence or previous or
subsequent exposure to physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds 
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The 
method comprises administration of a lung cancer inhibitor. In another embodiment, a 
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor. 

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In 
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 
Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also the methods section of Garkavtsev, et al. (1996),
supra, herein incorporated by reference.



Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype, become contact inhibited, and grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
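
As a minimal sketch of how a labeling index might be tabulated once labeled and total
nuclei have been counted from the autoradiographs, the following Python fragment computes
the percentage of labeled cells. The function name and the cell counts are hypothetical
illustrations and are not taken from Freshney (1994).

```python
def labeling_index(labeled_cells, total_cells):
    """Return the percentage of counted cells that incorporated (3H)-thymidine."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts for transformed host cells at saturation density,
# without and with the transfected test sequence.
control_index = labeling_index(labeled_cells=180, total_cells=400)
transfected_index = labeling_index(labeled_cells=60, total_cells=400)
```

A lower labeling index in the transfected cells than in the control would be consistent
with a restored density limitation of growth.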



Growth factor or serum dependence 
Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of a control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. See also Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
(1985) Anticancer Res. 5:111-130.

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is scored as invasiveness and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo 

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10⁶ cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that show a statistically significant reduction
(using, e.g., Student's t test) are said to have inhibited growth.
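
The comparison described above can be sketched in Python as follows. This is an
illustrative outline only: the volume approximation (length x width squared, divided by
two) is an assumed convention for estimating volume from the two largest dimensions and
is not specified in the text, and the function names and measurements are hypothetical.
The statistic is the classical pooled-variance two-sample Student's t.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Estimate volume from the two largest dimensions (assumed formula L * W^2 / 2)."""
    return length_mm * width_mm ** 2 / 2.0


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2


# Hypothetical (length, width) measurements in mm after the growth period.
treated = [tumor_volume(l, w) for l, w in [(6, 4), (7, 5), (6, 5), (8, 5)]]
control = [tumor_volume(l, w) for l, w in [(11, 8), (12, 9), (10, 8), (13, 9)]]
t_stat, dof = student_t(treated, control)
```

A negative `t_stat` indicates smaller tumors in the treated group; whether the reduction
is statistically significant would be judged by comparing the statistic with the critical
value of the t distribution for `dof` degrees of freedom at the chosen significance level.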

Polynucleotide modulators of lung cancer

Antisense and RNAi Polynucleotides 

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of an antisense or other inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the
antisense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally of at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein,
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
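
By way of illustration only, the derivation of candidate antisense oligonucleotides from
a cDNA sequence can be sketched in Python as below. The sketch simply enumerates every
window of about 14 to 30 nucleotides in the sense (cDNA) strand and reports its reverse
complement as a DNA antisense oligonucleotide; it does not reproduce the selection
criteria of Stein and Cohen or van der Krol, et al. The function names and the example
sequence are hypothetical and do not correspond to any sequence in the tables.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def antisense_candidates(cdna, min_len=14, max_len=30):
    """Yield (start, length, oligo) for each window of the sense strand.

    Each oligo is the DNA reverse complement of the window, so it is
    complementary to the corresponding stretch of the mRNA.
    """
    cdna = cdna.upper()
    for length in range(min_len, max_len + 1):
        for start in range(len(cdna) - length + 1):
            yield start, length, reverse_complement(cdna[start:start + length])


# Hypothetical 16-nucleotide cDNA fragment used only to exercise the functions.
example_cdna = "ATGGCCAAGCTTGACC"
candidates = list(antisense_candidates(example_cdna))
```

In practice, candidates would be screened further (e.g., for specificity of
hybridization to the target mRNA) before synthesis.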

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.

Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock-out and knock-in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is
down-regulated in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.



Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one lung cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e.,
a wild-type gene.

The sequence of all or part of the lung cancer gene can then be compared to the
sequence of a known lung cancer gene to determine if any differences exist. This can be
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the
presence of a difference in the sequence between the lung cancer gene of the patient and the
known lung cancer gene correlates with a disease state or a propensity for a disease state, as
outlined herein.
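
A simplified sketch of such a comparison is given below in Python. It is not Bestfit and
does not perform a scored biological alignment; it uses the general-purpose
`difflib.SequenceMatcher` from the Python standard library as a stand-in to list the
stretches where a sample sequence departs from a reference sequence. The function name
and both sequences are hypothetical.

```python
from difflib import SequenceMatcher


def sequence_differences(reference, sample):
    """List (operation, reference_start, reference_segment, sample_segment)
    for every stretch where the sample differs from the reference."""
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical wild-type and patient fragments differing by one substitution.
wild_type = "ATGGCCAAGCTTGACC"
patient = "ATGGCCAAGCTAGACC"
differences = sequence_differences(wild_type, patient)
```

An empty list indicates that no difference was found over the compared region; any
reported difference would still need to be evaluated for its correlation with a disease
state.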

In a preferred embodiment, the lung cancer genes are used as probes to determine the
number of copies of the lung cancer gene in the genome.

In another preferred embodiment, the lung cancer genes are used as probes to
determine the chromosomal localization of the lung cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the lung
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof, is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present
invention can be done in a variety of ways, including, but not limited to, orally,
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly,
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly
applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following:
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics, supra.

The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be given depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can
be administered alone or in combination with additional lung cancer modulating compounds
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of lung
cancer-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger); Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999); and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including both the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention 
can be expressed by viral or bacterial vectors. Examples of expression vectors include 

25 attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of 
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer 
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant 
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response. 
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful 
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol Med Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806; 
Hipp, et al. (2000) In Vivo 14:571-85). 

Methods for the use of genes as DNA vaccines are well known, and include placing a 
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene 
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably 
encodes portions of the lung cancer proteins including peptides derived from the lung cancer 
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung cancer-associated
genes or sequences encoding subfragments of a lung cancer protein are introduced
into expression vectors and tested for their immunogenicity in the context of Class I MHC 
and an ability to generate cytotoxic T cell responses. This procedure provides for production 
of cytotoxic T cell responses against cells which present antigen, including intracellular
epitopes.

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal
models of lung cancer. When the lung cancer gene identified is repressed or diminished in
metastatic tissue, gene therapy technology, e.g., wherein antisense or inhibitory RNA is directed
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout
technology, e.g., as a result of homologous recombination with an appropriate gene targeting 
vector, will result in the absence or increased expression of the lung cancer protein. When 
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary. 
It is also possible that the lung cancer protein is overexpressed in lung cancer. As
such, transgenic animals can be generated that overexpress the lung cancer protein.
Depending on the desired expression level, promoters of various strengths can be employed
to express the transgene. Also, the number of copies of the integrated transgene can be 
determined and compared for a determination of the expression level of the transgene. 
Animals generated by such methods will find use as animal models of lung cancer and are
additionally useful in screening for modulators to treat lung cancer. 



Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are
also provided by the invention. In diagnostic and research applications such kits may include
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi, 
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or
another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing instructions (e.g., 
protocols) for the practice of the methods of this invention. While the instructional materials 
typically comprise written or printed materials, they are not limited to such. A medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to, electronic storage media (e.g.,
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such
media may include addresses to internet sites that provide such instructional materials.

The present invention also provides for kits for screening for modulators of lung
cancer-associated sequences. Such kits can be prepared from readily available materials and
reagents. For example, such kits can comprise one or more of the following materials: a lung 
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing 
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer 
protein. A wide variety of kits and components can be prepared according to the present
invention, depending upon the intended user of the kit and the particular needs of the user.
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
typically will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
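
By way of a non-limiting illustration, the selection of genes based on correlation with a disease parameter can be sketched as follows. The sketch ranks probes by the Pearson correlation between their expression levels and an outcome value across patients. The probe names, the expression and outcome values, the function names, and the 0.8 cutoff are hypothetical and are not taken from the data described herein; other statistics could equally be used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / sqrt(sxx * syy)


def select_genes(expression, outcome, min_abs_r=0.8):
    """Return (gene, r) pairs with |r| >= min_abs_r, strongest first."""
    scored = [(gene, pearson(values, outcome)) for gene, values in expression.items()]
    kept = [(gene, r) for gene, r in scored if abs(r) >= min_abs_r]
    return sorted(kept, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical expression levels for three probes across five patients.
expression = {
    "probe_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "probe_C": [2.0, 2.0, 2.0, 2.0, 2.0],
}
# Hypothetical outcome parameter (e.g., a clinical score) for the same patients.
outcome = [10.0, 20.0, 30.0, 40.0, 50.0]

selected = select_genes(expression, outcome)
for gene, r in selected:
    print(f"{gene}: r = {r:+.2f}")
```

In this sketch both positively and negatively correlated probes are retained, since either direction may be informative; a probe with constant expression is discarded.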






EXAMPLES 






Example 1: Gene Chip Analysis

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as 
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993).
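
As a non-limiting illustration of how molecular profiles of normal and cancerous tissues might be compared, the following sketch computes a tumor-to-normal ratio of mean expression for each probe and labels probes as up- or down-regulated. It is not the analysis performed in the cited references; the probe identifiers, intensity values, function names, intensity floor, and the 2.0 and 0.5 thresholds are all assumptions made for the example.

```python
from statistics import mean


def expression_ratios(tumor, normal, floor=1.0):
    """Ratio of mean tumor to mean normal intensity for probes present in both sets.

    Means below `floor` are raised to `floor` so that near-zero signals
    do not produce unstable ratios.
    """
    ratios = {}
    for probe in tumor.keys() & normal.keys():
        t = max(mean(tumor[probe]), floor)
        n = max(mean(normal[probe]), floor)
        ratios[probe] = t / n
    return ratios


def flag_probes(ratios, up=2.0, down=0.5):
    """Label each probe 'up', 'down', or 'unchanged' from its ratio."""
    flags = {}
    for probe, ratio in ratios.items():
        if ratio >= up:
            flags[probe] = "up"
        elif ratio <= down:
            flags[probe] = "down"
        else:
            flags[probe] = "unchanged"
    return flags


# Hypothetical chip intensities: two tumor and two normal samples per probe.
tumor = {"p1": [400.0, 600.0], "p2": [50.0, 50.0], "p3": [100.0, 100.0]}
normal = {"p1": [100.0, 100.0], "p2": [200.0, 200.0], "p3": [100.0, 100.0]}

ratios = expression_ratios(tumor, normal)
flags = flag_probes(ratios)
for probe in sorted(ratios):
    print(probe, round(ratios[probe], 2), flags[probe])
```
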
Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501-001500US) and on November 29, 2001 in USSN 60/334,370 (18501-001520US)
Homo sapiens mRNA; cDNA DKFZp564B2062 (f
Hs.22483
Hs.16355












Hs.9218 










T57112 




yc20g11.s1 Stratagene lung (#937210)








T62979 


Hs.189813 








113540 
Hs.18757


ESTs 
113552


T90889 


Hs.16026


ESTs 
113606


T93093 


Hs.17125


ESTs 


1.48 


0.7 


113695 


T96965 


Hs.17948 


ESTs 




0.28 


113946 


W84753 


Hs.37896 


ESTs 


179 


0.72 


114251 


Z39898 


Hs.21948 


ESTs 


1.95 


0.25 


114359 


Z41589 


Hs.153483


ESTs; Moderately similar to H1 chloride 


1.42 


0.13 


115230 




Hs.182980


ESTs 


2.62 




115279 


AA279760 


Hs.63671 


ESTs 


1.75


1191 


115566 


AA398083 




ESTs 


0.86 


0.2 


115965 


AA446661


Hs.173233


ESTs 


0.75


0.04 


116166 


AA461556 


Hs.202949 


KIAA1102 protein
0.68


116279 


AA486073 


Hs.57362 


ESTs 
H88157


Hs.41105 


ESTs 
Hs.124292




AA479209 


Hs.56340 


128789 


AA486567 
AF014958
129210
Hs.202949




W24360 


Hs.237868
Hs.98314




AA447410 




129699 


AA458578 


Hs.12017
N48596


Hs.13256 
Hs.22588
Hs.24950 
Hs.29191




AA157428 
Hs.89485






Hs.89640 






Hs.90421 




M21056 


Hs.992 




D00591 


Hs.84746 




D13666 


Hs.136348


100280 


D42085 


Hs.155314






Hs.6793 


100360 


D78335 




100372 


D79997 


Hs.184339


100486 


HG1112-HT1112 
HG2197-HT2267


100576 


HG2290-HT2386 
HG2981-HT393
HG4716-HT5158
HG721-HT4827





v-ets avian erythroblastosis virus E26 o
yr30g11.s1 Soares fetal liver spleen
advanced glycosylation end product-speci
ESTs; Moderately similar to !!!! ALU SUB 
ESTs 

ESTs; Highly similar to KIAA0886 protein 



inhibitor of DNA binding 4; dt 



yw37g07.s1 Morton Fetal Cochlea Homo



HUM145B09B Clontech human fetal brain
ESTs 

ESTs; Weakly similar to plL2 hypothetica 



ESTs 
ESTs 

chemokine (C-C motif) receptor-like 2 

ESTs; Highly similar to Rap2 interacting 

CDW52 antigen (CAMPATH-1 antigen) 

KIAA1102 protein 

interleukin 7 receptor

yc21g01.s1 Stratagene lung (#937210)

vasoactive intestinal peptide receptor 1 

Homo sapiens mRNA; cDNA DKFZp586L0120 (f 

ESTs; Weakly similar to !!!! ALU SUBFAMI

KIAA0439 protein; homolog of yeast ubiqu 

ESTs 



cysteine-rich protein 1 (intestinal)
ESTs 

ESTs; Moderately similar to HYPOTHETICAL 



regulator of G-protein signalling 5 
epithelial membrane protein 2 
Grb2-associated binder 2 



Homo sapiens clone TUA8 Cri-du-chat regi 
slit (Drosophila) homolog 3
tetranectin (plasminogen-binding protein 
adipose specific 2 



Homo sapiens mRNA; cDNA DKFZp564M0763 (f 
transforming growth factor; beta recepto 
solute carrier family 35 (CMP-sialic aci
adenosine deaminase; RNA-specific; B1 (h
deleted in liver cancer 1 
ESTs 



TEK tyrosine kinase; endothelial (venous 
ESTs; Moderately similar to !!!! ALU SUB
phospholipase A2; group IB (pancreas) 
Chromosome condensation 1 
Homo sapiens mRNA for osteoblast specifi 
KIAA0095 gene product
platelet-activating factor acetylhydrola
Uridine monophosphate kin
KIAA0175 gene product



TIGS: ra 



iin TC4 



"collagen, type VII, alpha 1" 
"calcitonin/alpha-CGRP, alt. transcript 
"TIGR: CD44 (epican, alt. transcript 12 
Guanosine 5'-Monophosphate Synthase 
"TIGR: placental protein 14, endometrial 
Matrix metalloproteinase 9 (gelatinase
Achaete-scute complex (Drosophila) homol 
"Protease inhibitor 3, skin-derived (SKA 
"Melanoma antigen, family A, 2" 



102669 U71207 
U74612 
U91618 
102888 X04741 



103036 X54925 

103058 X57348 

103060 X57766 

103119 X63629 

103206 X72755 

103242 X76342 

103312 X82693 

103478 Y07755 

103558 Z19574 



103594 Z31560 
103768 AA089997 
104158 AA454908 



104906 AA055809 

104978 AA088458 

105012 AA116036 

105175 AA186804 

105263 AA227926 

105298 AA233459 

105312 AA233854 

105719 AA291644 

105743 AA293300 

106012 AA411621

106231 AA429571 

106540 AA454607 

106575 AA456039 

106632 AA459897 

106727 AA465342 

106906 AA490237 

107059 AA608545 

107104 AA609786 

107151 AA621169 

107284 S74039 

107901 AA026418 

107922 AA028028 

107932 AA029317 

108695 AA121315 

108857 AA133250 
AA133334

108990 AA152296

109166 AA179845 

109424 AA227919 

F05012 

109970 H09281 



Hs.75517 
Hs.313
Hs.90073 



Hs.1076 Small proline-rich protein 1B (cornifin)

Hs.195850 keratin 5 (epidermolysis bullosa simplex

Hs.267319 Endogenous retroviral protease 

Hs.220529 Carcinoembryonic antigen-related cell ad 

Hs.71642 "Guanine nucleotide binding protein (G p 

"Human parathyroid hormone-related pepti 

Hs.1590 Heparin-binding growth factor binding pr 

Hs.620 bullous pemphigoid antigen 1 (230/240kD) 

Hs.1925 Desmoglein 3 (pemphigus vulgaris antigen

Hs.184601 "Solute carrier family 7 (cationic amino

Hs.169840 TTK protein kinase 

Hs.112408 S100 calcium-binding protein A7 (psorias

"Homo sapiens connexin 25 (GJB2) mRNA : c 

Hs.78867 "Protein tyrosine phosphatase, receptor- 

Hs.82045 Midkine (neurite growth-promoting factor 

Hs.75117 "Interleukin enhancer binding factor 2,
"Laminin, beta 3 (nicein (125kD), kalini
secreted phosphoprotein 1 (osteopontin;
chromosome segregation 1 (yeast homolog) 

Hs.87b38 Aldehyde dehydrogenase 8 

Hs.77256 Enhancer of zeste (Drosophila) homolog 2 

Hs.30743 Preferentially expressed antigen in mela 

Hs.37110 "Melanoma antigen, family A, 9 (MAGE-9)"

Hs.29279 Eyes absent (Drosophila) homolog 2 

Hs.239 Forkhead box M1

Hs.80962 Neurotensin 

Hs.76118 Ubiquitin carboxyl-terminal esterase L1

Hs.80342 keratin 15 

Hs.2258 Matrix Metalloproteinase 10 (Stromelysin

Hs.37058 - 1 
Hs.85266 
Hs.83169 
Hs.184510
Hs.155324

Hs.2877 "Cadherin 3, P-cadherin (placental)" 

Hs.77357 monokine induced by gamma interferon 

Hs.389 "Alcohol dehydrogenase 7 (class IV), mu 

Hs.3185 "Lymphocyte antigen 6 complex, locus D;

Hs.38931 S100 calcium-binding protein A2

Hs.2785 keratin 17 

Hs.2631 Desmoglein 2 

Hs.82128 5T4 Oncofetal antigen

Hs.816 "SRY (sex determining region Y)-box 2, p
"ESTs, Highly similar to integral membra 

Hs.8127 KIAA0144 gene product

• - -'-ie9B7N21cn 



Hs.23071 

Hs.26802 

Hs.19322 

Hs.9329 

Hs.25740 

Hs.6682 

Hs.26369 

Hs.23348 

Hs.36793 

Hs.9598 

Hs.8895 

Hs.38002 

Hs.38114 

Hs.105421 

Hs.11950

Hs.34045 

Hs.222024 

Hs.23044 

Hs.15243 

Hs.8687 

Hs.291904 

Hs.91539 

Hs.61460 



ESTs; Weakly similar to !!!! ALU SUBFAMI
"Homo sapiens mRNA for fls353, complete
ESTs; Weakly similar to unknown [S.cerev



ESTs; same as BFH6? 
KIAA1355 protein
Hypothetical protein FLJ11100
ESTs

GPI-anchored metastasis-associated prote
Hypothetical protein FLJ20764
Transcription factor BMAL2 (cycle-like f
RAD51 (S. cerevisiae) homolog (E coli Re
Nucleolar protein 1 (120kD)
ESTs; procollagen I-N proteinase
Accessory proteins BAP31/BAP29
ESTs

Ig superfamily receptor LNIR precursor
Hypothetical protein FLJ21620
KIAA1077 protein
HSPC150 protein similar to ubiquitin-con


0.91


3.18








ESTs; Weakly similar to neogenin [H.sapi 




3.13






Hs.14559


Hypothetical protein FLJ10540




125 






Hs.293246


ESTs; Weakly similar to putative p150 [






111902 




Hs.109445 


KIAA1020 protein 










Hs.70823 


KIAA1077 protein 


077 


3.01 








"cDNA FLJ13308 fis, clone OVARC1001 435, 








T23482 


Hs.89981 


"Diacylglycerol kinase, zeta (104kD)" 














0.87


2 


















Hs.16740 


Hypothetical protein FLJ11036 








W86748 








1.73 








"ATPase, aminophospholipid transported 


0.86


0.82 


114407 


AA010188


Hs.103305


ESTs 


0.8


1.88


114471 


AA028074 


Hs.104613 


RP42 homolog 




1.34




AA043551


Hs.101799


KIAA1350 protein 


182 


2.32 






Hs.198249 


"Gap junction protein, beta 5 (connexin 


0.79 


1.49 




AA255900 


Hs.184523


KIAA0965 protein 


0.72


192 




AA256642


Hs.236894 


"ESTs, High sim to LRP1_hu low density 1 


0.59 


1.97




AA279943 


Hs.122579






125 




AA292537 


Hs.45207 


Hypothetical protein KIAA1335 


1.15
0.5 


1.48
3.29
1 




AA331393
AA347193 


Hs.52180
















6.53 




AA436666


Hs.59761 




1 


6.98 




AA447522 


Hs.59517










AA452112


Hs.42644 




0.99 






AA456968


Hs.92030 




1.14 


18 




AA460246 


Hs.50441 










AA461063 






0.99 


1.9 




AA461187 


Hs.61762 




0.44 


0.86 




AA495830 


Hs.87013 


"Homo sapiens cDNA FLJ10238 f:s, clone H 


0.62 


3.89 










1.04 


2.36 




N23239 


Hs.211092






0.64 






















0.98 


179 












1.43






Hs.48956 






2.86






Hs.42824


Hypothetical protein FLJ10718 


121 


0.83








KIAA1199 see CVA7.doc
Hs.191381


ESTs; Weakly similar to hypothetical pro 






















Hs.132927


"ESTs, Moderately simHar to p53 regulat 


1 








Hs.180479








AA253400 




Tumor protein 63 kDa with strong homolog 
12.05
















AA360240 


Hs.97019 










AA397822 


Hs.104650


Hypothetical protein FLJ10292


1.04 


2.15 




AA398209 










121362 


AA405500 


Hs.37932


Chondromodulin I precursor








AA405657


Hs.128791


CGI-09 protein 




1.8 




AA423978 


Hs.293317 


"ESTs, Weakly similar to JM27 [H.sapiens 








AA479726


Hs.105577 










AA481549 




B-cell CLL/lymphoma 11A (zinc finger pro


0.95 


1.88 




AA488687 


Hs.284235 






4.98 






Hs.135056










AA608956 




"ESTs, Weakly similar to PQ0109 Purkinje 


1.03 


2.2 






Hs.112208








D60302 


Hs.108977






4.85 






Hs.99769 






8.52 


124960 




Hs.194766












Hs.110024














"Melanoma antigen, family A, 10" 


0.8 


142 




AA425587 






152 


2.26 




AA434562 














Hs.270799 






195 




N7Q192 


Hs.27B956 


Hypothetical protein FLJ12929












STEAP1 (Homo sapiens BAC clone RG041D11








AI354332 


Hs.72365 




0.73 


3.27






Hs.179729


collagen; type X; alpha 1 (Schmid metaph 




1.94 
























0.97 




U46006 


Hs.10526


Cysteine and glycine-rich protein 2 












Plakophilin 3 










Hs.169902


"Solute earner family 2 (facilitated gl 




2.04


129099 


H50398 


Hs.108660


"ATP-bindlng cassette, sub-family C (CFT 


0.37 


1.04 


129404 




Hs.111128


ESTs 






129466 


U2583 




"Genbank Homo sapiens keratin 6 isoform 


0.72 


12.57 


129605 


S72493 




Keratin 16 (focal non-epidermolytic palm 


0.92 


1.5 


129628 


U26727 


Hs!l174 47 


"Cyclin-dependent kinase inhibitor 2A (ra 


0.35 


1.93 


130023 


X13461 


Hs.239600 


Calmodulin-like 3


0.34 


1.22 


130080 


X14850 


Hs.147097 


"H2A histone family, member X" 


0.98 


1.95 


130385 


AA126474


Hs.155223 


stanniocalcin 2 








130410 V01514 Hs.1! 

130441 U35835 Hs.3l 

130482 L32866 Hs.1! 

130553 AA430032 Hs.2: 

130577 M35410 



130800 AA223386 

130939 AA598689 

131046 X02530 

131244 D38076 

131877 J04088 

131927 AA461549 

131965 W90146 

131978 D80008 

132354 L05187 

132543 AA417152 

132632 N59764 

132653 U31201 

132659 Z75190 

132710 W93726 

132758 W52432 

132767 L05188 

132816 M74542 

132990 AA458761 

133070 U69611 

133282 U52960 

133317 AA215299 

133370 AA156897 

133391 X57579 

133832 H03387 

134032 Z81326 

134168 AA398908 

134218 AA227480

134405 R67275 

134453 X70683 

134470 X54942 

134645 U87459 

134781 M17183 

135002 U19147 

100040 M97935 

101201 L22524 

101664 M60752 



102391 U41668 

103000 X51956 

103395 X94754 

105638 AA281599 

105726 AA292328

114841 AA234722

115206 AA262491 

115906 AA436616 

119132 R49046 

124163 H30539 

126487 AA482505 

127141 AA307960 

128034 AA905754

128609 AA234365 

128895 R37753 

130199 Z48579 
100287 D43950
100297 D49489 
100330 D55716 
100355 D78129 
100364 D78586 
100368 D79987 
100398 D84557 
100438 D87448 
Hs.19574
Hs.21400 
Hs.2248 
Hs.24763 
Hs.156346
Hs.34780 



Hs.56105 

Hs.231622

Hs.575 

Hs.18387 

Hs.64311 

Hs.286145 

Hs.70830 

Hs.72157 

Hs.727 

Hs.241305 

Hs.78589 

Hs.181634 



Hs.167379
Hs.89626 
Hs.272484 

Hs.2256 
Hs.121017
Hs.78934 
Hs.2156 



Alpha-fetoprotein 

"Human DNA-PK mRNA, partial cds" 
Baculoviral IAP repeat-containing 5 (sur 
Pituitary tumor-transforming 1 
Insulin-like growth factor binding prote
Matrix metalloproteinase 12 (macrophage 
ESTs; Weakly similar to katanin p80 subu 
ESTs 

INTERFERON-GAMMA INDUCED PROTEIN PF 
RAN binding protein 1 
Topoisomerase (DNA) II alpha(170kD) 
"Doublecortex; lissencephaly, X-linked ( 
ESTs 

KIAA0186 gene product 

Small proline-rich protein 1A 

ESTs; Highly similar to protein regulati 



"laminin gamma2 chain gene (LAMC2), exon 
"Low density lipoprotein receptor-relate 
"Serine (or cysteine) proteinase inhibit 
"ESTs, Weakly similar to WDNM RAT WDNM1 
Small proline-rich protein 2B 
Aldehyde dehydrogenase 3 
transcription factor AP-2 alpha (activat 
"Adisintegrin and metalloproteinase dom 
"SRB7 (suppressor of RNA polymerase B, y 
U6 snRNA-associated Sm-like protein LSm7
Homo sapiens mRNA; cDNA DKFZp564I1922
H.sapiens activin beta-A subunit (exon 2
estrogen-responsive B box protein (EBBP) 
"Serine (or cysteine) proteinase inhibit 
"Homo sapiens cDNA: FLJ23602 fis, clone 
Pim-2 oncogene 
"""collagen, type XI, alpha 1™ 
SRY (sex determining region Y)-box 4 
CDC28 protein kinase 2 
"Cancer/testis antigen (NY-ESO-1, CTAG1, 
Parathyroid hormone-like hormone 
G antigen 6 
AFFX control: STAT1 
Hs.184601
Hs.75478 



Hs.159234
Hs.62402 
Hs.75426 
Hs.93597 



Hs.61153 
Hs.81892 
Hs.77329 



H2A histone family; member A
mutS (E. coli) homolog 2 (colon ca
RAR-related orphan receptor A 



phosphogluconate dehydrogenase 
cyclin-dependent kinase 4 
deoxyguanosine kinase 
enolase 2; (gamma; neuronal)



Hs.186572 ESTs



Homo sapiens mRNA for for histone H2B; c 
activating transcription factor 5 
ESTs; Moderately similar to CALCIUM-DEPE 



ATP-binding cassette; sub-family B (MDR/
ESTs 

solute carrier family 7 (cationic amino 
KIAA0956 protein 

tyrosine 3-monooxygenase/tryptophan 5-mo
survival of motor neuron protein interac 
ESTs 

a disintegrin and metalloprotease domain 
forkhead box E1 

p21/Cdc42/Rac1-activated kinase 1 (yeast
secretogranin II (chromogranin C)
ESTs 

AFFX control: 28S ribosomal RNA 
thymidylate synthetase
proteasome (prosome; macropain) 26S subu
KIAA0101 gene product
phosphatidylserine synthase 1



aldo-keto reductase family 1 ; member C3 
minichromosome maintenance deficient (S.
proteasome (prosome; macropain) subunit; 
"""Human mRNA for annexin II, 5'UTR (seq 
chaperonin containing TCP1, subunit 5 (e
protein disulf 



Hs.154868
Hs.153479
Hs.155462
Hs.91417 



"""Homo sapiens mRNA for squalene epoxid 
carbamoyl-phosphate synthetase 2: aspart 
extra spindle poles; S. cerevisiae; homo 
minichromosome maintenance deficient (mi 
topoisomerase (DNA) II binding protein 
N-myc downstream regulated
Hs.157460
Hs.174203
Hs.73798 
Hs.795 
Hs.84113 
Hs.82916 
Hs.878 
Hs.78802

Hs.182018
Hs.78996 
Hs.89839 
Hs.1'066 
Hs.75692 
Hs.12153 
Hs.251669 
Hs.1244

Hs.79217 



ne-rich 3 



solute carrier family 1 (glutamate/neutr 
macrophage migration inhibitory factor (
H2A histone family; member O
cyclin-dependent kinase inhibitor 3 (CDK
chaperonin containing TCP1; subunit 6A (

glycogen synthase kinase 3 beta 
""Homo sapiens (cell line HL-6) alpha t 
interleukin-1 receptor-associated kinase 



EphA1 



asparagine synthetase 
eukaryotic translation initiation factor 
casein kinase 2; beta polypeptide 
CD9 antigen (p24) 

"■"Human alpha-1 collagen type I gene, 3 
pyrroline-5-carboxylate reductase 1 
membrane component; chromosomal 4; si 



Hs.118400 singed (Drosophila)-like (sea urc



Hs.201967 
Hs.1594
Hs.1575 
Hs.75823 
Hs.2437



>er C1 



centromere protein A (17kD) 
small nuclear ribonucleoprotein D3 polyp 
ALL1-fused gene from chromosome 1q 
eukaryotic translation initiation factor 
lysyl oxidase-like 1 

karyopherin alpha 2 (RAG cohort 1; impor
'--'(DrosophilaHPIbeta 




Hs.12045 
Hs.93002 
Hs.54089 



BRCA1 as 

"■"liumanlllV-INefinti 
transcription factor AP-2 gamma (activat
chaperonin containing TCP1; subunit 2 (b 



Hs.2704 

Hs.74368 

Hs.2393 

Hs.1708

Hs.3155 

Hs.204133

Hs.77495 

Hs.75854 

Hs.54416 

Hs.114356

Hs.78596 

Hs.82254 

Hs.204238 

Hs.194657 

Hs.2340 

Hs.172928 



Hs.11801 
Hs.247280 
Hs.34744 



1; protein (NM23A) 
multifunctional polypeptide similar to S 
CDC28 protein kinase 1 
se M1 polypeptide 



.in; alpha 
hexabrachion (tenascin C; cytotactin) 
small nuclear ribonucleoprotein polypept 
SULT1C sulfotransferase
sine oculis homeobox (Drosophila) homolo
pyrroline-5-carboxylate synthetase (glut
proteasome (prosome; macropain) subunit,
M-phase phosphoprotein 11
lipocalin 2 (oncogene 24p3) 
cadherin 1; E-cadherin (epithelial) 
junction plakoglobin 
collagen; type I; alpha 1 
ESTs; Weakly similar to R07G3.8 [C.elega 
RNA polymerase I subunit 
ESTs; Weakly similar to R27090J [H.sapi 
KIAA0956 protein 

collagen; type VII; alpha 1 (epidermolys 
cystatin SN

ribulose-5-phosphate-3-epimerase 
ESTs; Weakly similar to ACYL-COA DEHYDRO 
adenosine A2b receptor pseudogene 
HBV associated factor 






105705 AA290767



105936 AA404338 

106069 AA417741 

106103 AA421104

106140 AA424524 

106149 AA424881 

106154 AA425304

106182 AA426609 

106220 AA428582 

106228 AA429290

106318 AA436570 

106341 AA441798 

106432 AA448850

106474 AA450212 

106483 AA451676 

106599 AA457235

106611 AA458904 

106654 AA460449 

107076 AA609145

107115 AA610108 

107129 AA620553 

107159 AA621340

107444 W28391 

107481 W58247 

107516 X56597 

107529 Y12065 

107531 Y13936 

107801 AA019433 

107957 AA031948 

108565 AA085342 

108780 AA128561 

108828 AA131584 

109060 AA160879 

109112 AA169379 

109344 AA213696

109412 AA227145 

110780 N23174 

110958 N50550 

111018 N54067 

111337 N79612 

112305 R54822 

112401 R61279 

112853 T02843 



114587 AA070827 

114846 AA234929 

114964 AA243873 

115047 AA252627

115166 AA258409 

115167 AA258421 
115239 AA278650 
115278 AA279757 
115652 AA405098 
115875 AA433943 
116004 AA449122 
116121 AA459254 
116129 AA459956 



117950 N51394 

117992 N52000 

118785 N75386 

119717 W69134 

119814 W74069 

120128 Z38499 

120242 Z98443 



Hs.8645 

Hs.21214 

Hs.6375 

Hs.15202 

Hs.101282 

Hs.22934 

Hs.21580 

Hs.24743 

Hs.16869 

Hs.26662 



Homo sapiens mRNA; cDNA DKFZp564K0222 (f 
ESTs; Weakly similar to oligodendrocyte-
Homo sapiens mRNA; cDNA DKFZp434B102 (fr
ESTs; Weakly similar to ZINC FINGER PROT



Hs.256301 ESTs 



Hs.17719 
Hs.9605 
Hs.5243 
Hs.17138 
Hs.42484 
Hs.30299 



ESTs; Weakly similar to ZINC FINGER PROT 
ESTs 

KIAA0286 protein 



Hs.5181 
Hs.27437 
Hs.99853
Hs.5092 
Hs.17883 
Hs.173100 
Hs.57548 
Hs.1526 
Hs.117938 
Hs.71435 
Hs.241651 
Hs.72865 



ESTs 

Homo sapiens mRNA; cDNA DKFZp564C053 (fr
IGF-II mRNA-binding protein 2 
ESTs; Moderately similar to non-function 
ESTs; Weakly similar to torsinA [H.sapie 
ESTs; Highly similar to phosphoserine am 
ESTs; Weakly similar to fos39554_1 [H.sa 
ESTs; Highly similar to CGI-124 protein 
flap structure-specific endonuclease 1 
ESTs; Weakly similar to ORF YKR081C [S.c 
proliferation-associated 2G4; 38kD
Homo sapiens kinesin superfamily motor K



ATPase; Ca++ transporting; cardiac muscl 
collagen; type XVII; alpha 1 
DKFZP564O0463 protein 
chloride channel; calcium activated; fam 
ESTs 

poly(A)-binding protein-like 1 
ESTs; Weakly similar to REGULATOR OF Mr
solute carrier family 1 (cationic amino 
signal transduction protein (SH3 contain 
mitogen-activated protein kinase kinase
ESTs; Highly similar to Myosin heavy cha 
ESTs 

ESTs; Weakly similar to F25B5.3 [C.elega



Hs.184008 ESTs; Weakly similar to RNA-binding prot
Hs.5027 ESTs 

Hs.152571 ESTs; Highly similar to IGF-II mRNA-bind 



Hs.44343 
Hs.82184 
Hs.22554
Hs.198907
Hs.43728 
Hs.73291 
Hs.67466 
Hs.38178
Hs.43946 
Hs.76086
Hs.48855
Hs.49163 
Hs.67776 
Hs.65403 



ring finger protein 3 

homeo box B5

myelin protein zero-like 1 



Hs.111867 
Hs.57987 
Hs.58350
Hs.91448

Hs.86366 ESTs 



ESTs; Weakly similar to GOLIATH PROTEIN 
KIAA0956 protein 

Homo sapiens mRNA; cDNA DKFZp586B0222 (f 
GLI-Kruppel family member GLI2 



MKP-1 like protein tyrosine phosphatase 

120483 AA252994 Hs.1578 
121054 AA398604 
AA404246 
121376 AA405699 



21457 , 
121780 AA422086
121781 AA422150 
AA425732 
AA431737 
AA443311 
122354 AA443772 
AA453265 
122790 AA460156 
123398 AA521265
AA608531 
AA609471 
124000 D57317 
124367 N24006 
124447 N48000 
125756 W25498 
125769 AI382972 
125852 H09290 
AA526849 
M85772 
126214 N29455 
126414 N78770 
126737 AA488132 
126743 AA179253 
— AA179546 
127432 AA501734 
128218 H02682 
128527 M31523 
128568 X60673 
128584 M11433 
128628 C14037 
128691 W27939 
128714 V00599 
128733 AA328993 
128781 X85372 
129052 AA496297 
129095 L12350 
129241 AA435665 



129896 AA043021 



131028 U20240

131083 U66661

131091 T35341

131144 C14412

131148 C00038

131164 Y00503

131185 M25753

131219 C00476 

131454 AA455896

131687 L11066 

131689 AA599653 

131692 D50914 

131786 AA135554 

131843 AA195893 

131860 U02082 

131884 H90124 

131903 AA481723 

131945 MB7339 

131958 AA093998 

131964 W42508 
J00277 

132040 AA146843 

132065 D82226 

132109 AA599801 

132112 AA150661 

132123 AA447123 

132162 H89551 

132180 AA405569 

132309 AA460917 

132371 AA235448

132618 AA253330 

132736 U68019 



Hs.97387 ESTs 



apoptosis inhibitor 4 (survivin) 






Hs.166232 
Hs.208985 
Hs.124660 
Hs.98370 
Hs.98485 



Hs.186692 
Hs.99311 

Hs.105514

Hs.170313 

Hs.112712 

Hs.74861 

Hs.99348 

Hs.140945 

Hs.81634 

Hs.82128 

Hs.76550 

Hs.82109 

Hs.6066 

Hs.74316 

Hs.223439 

Hs.62741 

Hs.172182 



Homo sapiens mRNA; cDNA DKFZp586L141 (fr
ATP synthase; H+ transporting; mitochond
5T4 oncofetal trophoblast glycoprotein 
Homo sapiens mRNA; cDNA DKFZp5S491 264 {f 
syndecan 1 
KIAA1112 protein
desmoplakin (DPI; DPII)
ESTs 



Hs.247568 

Hs.101850 

Hs.251978 

Hs.103834 

Hs.179661 

Hs.104558 

Hs.105465 

Hs.182740 

Hs.108623 

Hs.109706 

Hs.118778

Hs.179999 

Hs.12152 

Hs.56845 

Hs.13225 

Hs.146428 

Hs.155396 

Hs.211584 

Hs.174070 

Hs.2058 

Hs.22142 

Hs.2227 

Hs.22785 



retinol-binding protein 1; cellular 

EST 

ESTs 

Homo sapiens clone 24703 beta-tubulin mR



ribosomal protein S11 



ESTs 

ESTs; Moderately similar to SIGNAL RECOG 
GDP dissociation inhibitor 2 
UDP-Gal:betaGlcNAc beta 1;4- galactosylt 
collagen; type V; alpha 1 
nuclear factor (erythroid-derived 2)-lik



ubiquitin carrier protein 



ESTs; Weakly similar to NADH-CYTOCHROME
CCAAT/enhancer binding protein (C/EBP);
gamma-aminobutyric acid (GABA) A recepto
ESTs; Highly similar to dipeptidyl pepti
ESTs; Highly similar to HSPC038 protein 



Hs.24395 small inducible cytokine subfamily B (Cy 



Hs.3069 
Hs.30696 
Hs.30736 
Hs.32125 
Hs.184062 



Hs.3436 

Hs.35120 

Hs.3566 

Hs.3593 

Hs.37003 

Hs.172894 

Hs.211594 



heat shock 70kD protein 9B (mortalin-2) 
transcription factor-like 5 (basic helix 
KIAA0124 protein
ESTs 

ESTs; Moderately similar to putative Rab 

Oncogene TIM 

ribosomal protein S23 

deleted in oral cancer (mouse; homolog) 

replication factor C (activator 1) 4 (37

ESTs; Highly similar to phosphorylation 

ESTs 

v-Ha-ras Harvey rat sarcoma viral oncoge 
BH3 interacting domain death agonist
proteasome (prosome; macropain) 26S subu
ESTs 

jumonji (mouse) homolog 



Hs.41241 ESTs 



Hs.2780 
Hs.46677
Hs.5344 
Hs.211578 



fibroblast activation protein; alpha; se 

132771 AA488432 Hs.56407 
132833 U78525 Hs.57783 
132922 T23641 
132994 AA505133

133005 C21400 

133065 X62535 

133083 N70633 

133086 L17131 

133134 T89703 

133195 AA350744 

133313 AA249427 



133540 D78151 

133594 L07758 

133627 U09587 

133671 T25747 

133859 U86782 

133865 F09315 

133913 W84712 

133963 L34587 

133982 U47621 

134100 L07540 

134110 U41060 

134158 U15174 

134161 U97188 

134193 F09570 

134367 X54199 

134402 U25165 

134457 D86963 

134469 X17567 

134498 M63180 

134501 W84870 

134507 M63488 

134548 U41515 

134599 X99226 

134692 R73567 

134693 N70361 
134806 Z49099 
134821 Z34974 
134864 Y08999 
134914 U29615 
134953 L10678 
134993 AA282343 
135051 C15324 
135158 U51711 



Hs.61472 
Hs.7594 
Hs.103329 



Hs.181409 
Hs.70704 
Hs.158675
Hs.73722 
Hs.73797 
Hs.74070 
Hs.74137 
Hs.74316 
Hs.74471 
Hs.74619 
Hs.172539 
Hs.75280 
Hs.75471 



ESTs; Weakly similar to unknown [S.cerev 
solute carrier family 2 (facilitated glu 
KIAA0970 protein 
diacylglycerol kinase; alpha (80kD) 




desmoplakin (DPI; DPII)
gap junction protein; alpha 1; 43kD (con
proteasome (prosome; macropain) 26S subu
'i similar to S.cer 



glycyl-tRNA synthetase 
zinc finger protein 146 
26S proteasome-associated pad1 homol 
Hs.170290 discs; large (Drosophila) homolog 5
Hs.7753 calumenin

Hs.184693 transcription elongation factor B (SIII)
Hs.207251 nucleolar autoantigen (55kD) similar to
Hs.171075 replication factor C (activator 1) 5 (36
Hs.79136 LIV-1 protein; estrogen regulated



Hs.7980

Hs.82285 

Hs.82712 

Hs.174044 

Hs.83753 

Hs.84131 

Hs.211568 

Hs.84318 

Hs.85215 

Hs.86297 

Hs.8850 



ESTs 

phosphoribosylglycinamideformyltransfer 
fragile X mental retardation; autosomal 
dishevelled 3 (homologous to Drosophila 
small nuclear ribonucleoprotein polypept 
threonyl-tRNA synthetase 
eukaryotic translation initiation factor
replication protein A1 (70kD)
Deleted in split-hand/split-foot 1 regio
Fanconi anemia; complementation group A 
a disintegrin and metalloproteinase doma 



ESTs 

Hs.89718 spermine synthase 

plakophilin 1 (ectodermal dysplasia/skin 
actin related protein 2/3 complex; subun 
chitinase 1 (chitotriosidase)



Hs.90370 
Hs.91093 
Hs.91747 
Hs.9242 
Hs.93668 



purine-rich element binding protein B 
ESTs 

Human desmocollin-2 mRNA; 3' UTR 



Table 1B shows the accession numbers for those pkeys in Table 1A lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.
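As an illustration only, the row layout described above (probeset key, gene cluster number, then the Genbank accessions comprising the cluster) can be read with a short Python sketch. The whitespace-separated layout, the underscore form of the cluster number, and the names ClusterRow, parse_cluster_row and index_by_pkey are assumptions made for this example; they are not part of the original filing.

```python
from typing import NamedTuple, Tuple


class ClusterRow(NamedTuple):
    pkey: str                    # Eos probeset identifier
    cat_number: str              # gene cluster number the oligonucleotides were designed from
    accessions: Tuple[str, ...]  # Genbank accessions comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one cleaned row of the form 'pkey cluster accession accession ...'."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected pkey, cluster number and at least one accession")
    pkey, cat_number, *accessions = fields
    return ClusterRow(pkey, cat_number, tuple(accessions))


def index_by_pkey(lines):
    """Build a pkey -> ClusterRow lookup, skipping blank lines."""
    rows = (parse_cluster_row(line) for line in lines if line.strip())
    return {row.pkey: row for row in rows}


# Hypothetical cleaned rows, for illustration only.
example = index_by_pkey([
    "100780 468_127 BE561958 BE561728 BE397612",
    "",
    "999999 1_1 X00001",
])
```

A lookup of this kind lets a probeset without a UnigeneID be traced back to the accessions of its cluster.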



100661 23182 1 BE623001 L05096 M383604 AW966416 N53295 AA460213 AW571519 AA603655 

100667 26401" 3 L05424 X56794 S66400 X55150 W60071 AW351 820 X55938 M83326 BE005289 BE070059 M83324 BE005248 BE069717 BE181648 BE069700 

AW606203 BE069721 AW382138 AW803776 BE463954 BE005334 BE005274 T27386 AA932714 AA9726S5 AW377728 AI632506 T29055 
AI783934 AW377727 BE163715 AL047291 M279047 AA523003 BE008048 BE440141 W23S14 BE090519 BE092193 N29181 N20358 N44153 
BE546944 T69231 AW377441 AA9O7406 H50799 AW051 416 AI420712 BE620922 AI279161 M992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 AI950633 AA1 46687 H99482 X55150BE005414 BE005339 N2B294 AI673068 AI8S7890 AW8041 71 
AI675961 AW804172 AA778841 AL048050 AI127757 A1095568 AW204965 AW468978 W31898 AI052595 AI278771 BE46401 8 AI081 503 AI824196 
AA51 321 1 AA41 1062 AW084376 N48752 AA7032C9 N35580 AW05991 8 AA054563 AI280942 T27619 BE621 435 N6601 0 AW589527 AI1 60414 
AA283090 AA962536 H82726 W521 1 5 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

100668 26401J L05424 X56794 S66400 X55150 W60071 AW351820 X55938 M83326 BE005289 BE070059 M83324 BE005248 BE069717 BE181648 BE069700 

AW606203BE069721 AW362138 AW803775 BE463954 BE0053S4 BE005274 T27386 AA932714 AA97269S AW377728 AI632506 T290S6 
AI783934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008043 BE440141 W23614 BE090519 BE092193 N29181 N20358 N44153 
BE546944T69231 AW377441 M907406 H50799 AW051 41 6 A142071 2 BE620922 A1279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 AI950633AA 146687 H99482 X55150 BE005414 BE005339 N28294 A1573068 AI887890 AW804171 
AI675961 AW8041 72 AA778841 AL048050 AM 27757 AI095568 AW204965 AW458978 W31 898 AI052595 AI278771 BE46401 8 AI081503 AI824196 
M513211M411062AW084376 N48752AA703209N35580AW059918AA054563AI280942T27619BE621435 N66010AW589527AI160414 
M283090 M962536H82726W52115W45432 W60433AA577548M146714BE150994 M054615AW796025AW382768 BE565671 C00444 
AA054555 

101332 25130J J04088 NM_001067 AF071747 AJ01 1741 N85424 AL042407 A^218572 BE296748 BE083981 AL040877 AW499918 AW675045 H1 7813 
BE081283 AA670403 AW504327 BE094229 AA1 04024 AI471 432 AI970337 AA73761 6 AI827444 AW003286 AI742333 AI344044 AI765634 
AI948838 AW235336 AW172827 AA0952S9 BE046383 AI734240 W16699 AI660329 AI289433 AA933778 AW469242 AA468838 AA806983
M625873 W78031 BE206307 AA550803 AI743147 AI990075 AA948274 AA129533 AI635399 AA605313 AI624559 AW594319 AI221834 AI337434 
AA307706 BE550232 A1760467 M63CE 4- 221521 AW6 1 W071 9 Al S33732 AI686969 A! 1 86928 AW074595A1 127486 AL079644 
AI910815 H17814 AA310903 AW137854T19279 AA026682 AA3C6035 AW383390 AW383389 AW383422 AW383427 AW383395 H09977 
M306247 AA352501 AW403639 F05421 AA224473 AA305321 H93904AA089612AW391543AW402915AW173382AW402701 AW403113 
R94438 N73125 H93466 AA090928 AA095051 T29025 AW951071 L47277 L47276 AI375913 BE384156 W24652 AA746288 AA568223 BE090591 
H93033 N57027 AA504348 AA327653 AW959913 N53757 AA843715 AI453437 AW263710 AI076594 M583483 AW873194 AW575166 AI128799 
AI80331 9 AL042776 AW074313 AI887722 AI032284 AA447521 AI123885 N29334 AI35491 1 AW090687 AA236763 AA435535 AA236910 
AA0471 24 AA236734 AW514610 H93467 M962007 AI446783 AA1 27269 AI613495 AI686720 AI687374 AA936731 AA702453 AI859757 
AA21 6786 AI251819 AI469227 AA806022 AI092324 N7 1668 AA968782 AA23691 9 AA809450 M227220 AA765284 AI192007 AA76881 0 
AA805794AA729280AA806238AW768817 N71879AI050686AA505822AA668974AI688160 BE045915 AW466315AA731314AA649668 
AA834316 AW591901 AW063876 AW294770 AI300266 AI336094 AI5603B0 M721766 H09978 D20305 D29166 AW821790 BE150864 F01675 
AI457474 AW46631 6 AA550969 AA630788 

100780 468 127 BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 BE269598 BE559865 
BE396881 BE560031 BE514199 BE560037 BE560454 

100830 4002 1 AC004770W05005AA356068AA094281 H29358T56781 AW875313 L37374 BE31 2466 BE31 1755 BE207106 BE293320 BE018115 AW239090 
BE548830 AW247547 M776062 BE397382 AA48671 3 T1 01 ! 1 T09340 AW498981 BE547280 AA356003 AW581520 AW875331 AA580720 
AW875336 BE276873 BE408229 AW1 881 48 BE255 1 66 BE253761 AW793727 AW3731 41 AW581 548 AA471 223 AA305950 BE263976 AA626820 
BE257409 AW360962 AA090655 C0031 2 BE31 2741 BE40721 3 AA209352 AW298199 AW248553 AW297794 AW731722 BE300586 AW731972 
AW615446 BE301599 AW615520 AA48671 4 AW440257 AA19651 6 AA564630 AA61 8079 AW192592 AW474985 AA604580 AI627461 AA765440 
AI680394 AL135548 AI683224 AI581 126 AW245096 AW1941 54 H29274 N70363 AA629758 AA580602 MB62006 AI863841 AI097667 AI928583 
AI358774 BE243487 AA620553 AA653297 AA292690 T1 01 1 0 Z38906 AA908544 AA340930 AI1 B5438 T03328 T28844 AI68701 0 AI864965 
AI872575 BE388740 T56780 AW373138 BE258717 M699671 

100906 4312J AU076916 BE2981 1 0 AW239395 AW572700 NM_003875 U10860 AW651755 BE297958 C0380B AI795B76 M644165 T36030 AW392852 

AA446421 AW881 866 AI469428 BE5481 03 T96204 R94457 N78225 AI564549 AW004984 AW780423 AW675448 AW087890 AA971454 AA305698 
AA879433 AA535069 AI394371 AA928053 AI378367 N59764 A1364000 AI431285 T81 090 AW674657 AW674987 AA897396 AW673412 BE063175 
AW674408 AI20201 1 R00723 AI753769 AI4601 61 AW079585 AW275744 AIB73729 D25791 BE537646 T81 1 39 R00722 

100930 16865 1 J04129NM 002571 AA293088AA477016AA404531 T28299 AA476904 AA433965 AA43Q4B6 AA495907 A! 1 51 391 AA29 1 495 AA402723 W25651 
AA706816 AI826712 AW296294 AA293479 AI276581 ATO44154 AI080180 AI417985 AI274168 AI474212 AA495908 AA635664 AI0921 1 4 
AI804952 AA479874 AI597661 AI42051 1 AA479738 AA421417 AA421 247 AA436220 AL047797 M34046 N42277 M228076 W02698 AI420297 
AA434011 AI369971 AA479731 AI865541 AI418020M421246 AA452764 AL048051 

102221 3861J NM_006769 U24576 AW161961 AW1 60473 AW1 60465 AW160472 AW161069 AI824B31 AW162635 AI990356 AW162477 AW162571 AI520836 
AW162352 AW162351 AW162752 AI962216 AI537346 AA853902 H17667 BE045346 BE559802 BE255391 AA985217AA235051 AI129757 
AW366451 T34489 D561 06 D56351 AI936579 AW023219 AW889335 AW8891 20 AW889232 AW8891 75 BE093702 AW889349 AA147546 
AI952993AA912579AI143356AW902211 R64717 AW157236AI815242 D45274 AW263991 AA442920AA129965 AL035713 AI923255 AI949082 
AI142826 AI6841 60 AI701 987 AI678954 AI827349 BE463635 AW628092 AW302281 AA493203 BE348656 BE536419 AW1 93969 AW673561 
AW592609 A1224044 H43943 AA091912 R49632 R48353AI568409 R48256 AI198046 H27986 H43899 AI678759 AI680310 AI624220 H17052 
AA156410 N56062 AI699430 AA664529 TQ9406 T1 045S AA527506 A13795S4 N83831 N88633AW022651 AA971281 AA248036 AI039197 
AB14689 AA973825 AL047305 AA129966 AI79B369 AW26434B AI445B79 AI65B759 N67924 AI933507 AI216121 AI333174 T10972 AI375028 
AI186756 AI273778 AA610487 AI797946 AA853903 AA903939 AI338587 AI278494 AW627595 AA904019 

101809 32963J M86849 AA315280 NMJXMOM AA315269 BE1 42653 AA461 400 AW802042 BE1 52893 AW383155 AA490688 AW117930 AW384563 AW384544 
AW384566AW378307AW378323AW839085AA257102AW378317AW276060AW271245AW37829BAW384497AI598114AW264544AI018136 
AW02181 0 M961504 AW086214 AW771489 AW1 92483 AI290266 AW1 92488 AW384490 AW007451 AW890895 M554460 AA613715 
AW020066 AI783695 AI589498 AI917637 AW26447 1 AVV334491 AI81 6732 AW368530 AW368521 AW368463 M4610B7 AI341438 AI970613 
AI040737 AI418400 AA947181 AA962716 AI280695 AW769275 AW023591 AM 60977 AA055400 N71BB2AA490466 AW243772AW316635 
AI076554 AW51 1702 N69323 H8891 2 AA25701 7 AI952506 H88913 AI912481 AA600714BE465701 N64149 C00523N64240AA577120 

102590 15932J R61573 BE005029 X98091 AA297307 BE537257 BE566138 BE5661 39 F1 1 561 BE564795 BE568776 AW064005 BE566479 BE380035 BE567012 
BE568634 BE566568 AA298060 BE566043 BE56881 3 BE568618 AA283070 BE56541 4 BE566738 BE568585 BE5656B7 BE5661 16 BE556433 
Tables 2A-8C were previously filed on November 9, 2001 in USSN 60/339,245 (18501-004100US)

Table 2A shows 504 genes down-regulated in lung tumors relative to normal lung and chronically diseased lung. Chronically diseased lung samples represent chronic non-malignant
lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene
expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 90th percentile of AI for normal lung samples divided by the 80th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples.

R2: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples.

R3: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by
the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples minus the 15th percentile of AI for all normal
lung, chronically diseased lung and tumor samples.

R4: average of AI for normal lung samples divided by average AI for squamous cell carcinoma and adenocarcinoma lung tumors.

R5: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinomas.

R6: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th
percentile of AI for adenocarcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.

R7: average of AI for normal lung samples divided by the 90th percentile of AI for squamous cell carcinomas.

R8: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th
percentile of AI for squamous cell carcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
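The ratio definitions R1 to R8 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to generate the table: the percentile interpolation rule is not given in the text (linear interpolation is assumed here), the "minus the 15th percentile" ratios are read as (median - background) / (90th percentile - background), and the function names and sample values are hypothetical.

```python
from statistics import mean, median


def percentile(values, p):
    """p-th percentile (0-100), linear interpolation between sorted values (assumed rule)."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no values")
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def table_2a_ratios(normal, chronic, adeno, squamous):
    """Compute R1-R8 for one probeset from lists of average intensity (AI) values."""
    tumor = adeno + squamous
    # 15th percentile of AI over all normal, chronically diseased and tumor samples
    background = percentile(normal + chronic + tumor, 15)

    def background_subtracted(target):
        # assumed reading: (median normal - background) / (90th pct target - background)
        return (median(normal) - background) / (percentile(target, 90) - background)

    return {
        "R1": percentile(normal, 90) / percentile(tumor, 80),
        "R2": median(normal) / percentile(tumor, 90),
        "R3": background_subtracted(tumor),
        "R4": mean(normal) / mean(tumor),
        "R5": median(normal) / percentile(adeno, 90),
        "R6": background_subtracted(adeno),
        "R7": mean(normal) / percentile(squamous, 90),
        "R8": background_subtracted(squamous),
    }


# Hypothetical AI values for a single probeset, for illustration only.
ratios = table_2a_ratios(
    normal=[100, 110, 120, 130, 140],
    chronic=[90, 95, 100],
    adeno=[10, 20, 30],
    squamous=[15, 25, 35],
)
```

Subtracting a common low-percentile background before taking the ratio keeps probesets whose signal sits near the array noise floor from producing inflated ratios.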







Pkey ExAccn UnigeneID Unigene Title

100095 Z97171 Hs.78454 myocilin; trabecular meshwork inducible
100115 NM_002084 Hs.336920 glutathione peroxidase 3 (plasma)
100138 U83508 Hs.2463 angiopoietin 1
100299 D49493 Hs.2171 growth differentiation factor 10
100306 U86749 transcription elongation factor A (SII);
100447 NM_014767 Hs.74583 KIAA0275 gene product
100458 S74019 Hs.247979 Vpre-B
AA005247 Hs.285754 Hepatocyte Growth Factor Receptor
100959 AA359129 Hs.118127 actin; alpha; cardiac muscle
101032 BE206854 Hs.46039 phosphoglycerate mutase 2 (muscle)
101081 AF047347 Hs.4880 amyloid beta (A4) precursor protein-bind
101088 X70697 Hs.553 solute carrier family 6 (neurotransmitte
101125 AJ250562 Hs.82749 transmembrane 4 superfamily member 2
101180 U11874 Hs.846 interleukin 8 receptor; beta
101308 L41390 "Homo sapiens core 2 beta-1,6-N-acetylgl
101330 L43821 Hs.80261 enhancer of filamentation 1 (cas-like do
101345 NM_005795 Hs.152175 Calcitonin receptor-like
101346 AI738616 Hs.77348 hydroxyprostaglandin dehydrogenase 15-(N
101397 M26380 Hs.180878 lipoprotein lipase
101414 Hs.38069 complement component 8; beta polypeptide
101435 NM_001100 Hs.1288 actin; alpha 1; skeletal muscle
101507 Hs.82112 interleukin 1 receptor; type 1
101530 M29874 Hs.1360 cytochrome P450; subfamily IIB (phenobar
101537 AI469059 Hs.184915 zinc finger protein; Y-linked
101542 NM_000102 Hs.1363 cytochrome P450; subfamily XVII (steroid
101545 BE246154 Hs.154210 EDG1; endothelial differentiation, sphin
101554 BE207611 Hs.123078 thyroid stimulating hormone receptor
AW958272 Hs.83733 Intercellular adhesion molecule 2, exon
101574 M34182 Hs.158029 protein kinase; cAMP-dependent catalyti
101605 M37984 Hs.118845 troponin C; slow
101621 BE391804 Hs.62661 guanylate binding protein 1; interferon-
101680 AA299330 Hs.1042 Sjogren syndrome antigen A1 (52kD; ribon
101829 AW452398 Hs.129763 solute carrier family 8 (sodium/calcium
101842 M93221 Hs.75182 mannose receptor; C type 1
101961 AW004056 Hs.168357 "Hs-TBX2=T-box gene {T-box region} [huma
T92248 Hs.2240 uteroglobin
102020 AU077315 Hs.154970 transcription factor CP2
102091 BE280901 Hs.83155 aldehyde dehydrogenase 7
AW025430 Hs.155591 forkhead box F1
102190 AA723157 Hs.73769 folate receptor 1 (adult)
102202 NM_000507 Hs.574 fructose-bisphosphatase 1
NM_007351 Hs.268107 Multimerin
102310 U33839 Accession not listed in Genbank
102397 U41898 "Human sodium cotransporter RKST1 mRNA,
U60115 Hs.239069 "Homo sapiens skeletal muscle LIM-protei
102620 AA976427 Hs.121513 Human clone W2-6 mRNA from chromosome
102636 U67092 "Human ataxia-telangiectasia locus prote
102667 solute carrier family 21 (prostaglandin
102675 U72512 Hs.7771 "Human B-cell receptor associated protei
102698 M18667 Hs.1867 progastricsin (pepsinogen C)
102727 U79251 Hs.99902 opioid-binding protein/cell adhesion mol
102852 V00571 Hs.75294 corticotropin releasing hormone
103026 X54162 Hs.79386 thyroid and eye muscle autoantigen D1 (6
103028 X54380 Hs.74094 pregnancy-zone protein
103098 M86361 Human mRNA for T cell receptor; clone IG
103117 X63578 Hs.295449 parvalbumin
103241 X76223 H.sapiens MAL gene exon 4
103280 U84722 Hs.76206 Cadherin 5, VE-cadherin (vascular epithe
103360 Y16791 Hs.73082 keratin; hair; acidic; 5
BE244667


























Hs.143067






BE219898 


Hs.173135
















Hs.31297






AW960427 


Hs.79059 


ESTs; Moderately similar to TGF-BETA REC




AW975687 




















ESTs; Weakly similar to SliM protein [ 












AA010539
















AI039243 
















AA035613 


































Hs.36529






















































































AI904740 
















AA045290 












KIAA0554 protein 






































Hs.21103


Homo sapiens mRNA; cDNA DKFZp564B076 (fr 


106842 


AF124251


Hs!26054 


novel SH2-containing protein 3




AA485055 


Hs.158213 


sperm associated antigen 6 


106870 








106943 


AW888222 


Hs!9973 


ESTs 


106954 


AF128847 


Hs.204038 


ESTs 




























U90545 








































AA017291 


Hs.60781 


























ESTs; Weakly similar to !!!! ALU SUBFAMI












































AA078899 








AA079126 








AL133092


Hs.68055




































108625 AW972330 Hs.283022 ESTs
108629 AA102425 "zn24o6.s1 Stratagene neuroepithelium NT
108655 AA099960 "zm65c6.s1 Stratagene fibroblast (#93721
108756 AA127221 Hs.117037 Homo sapiens mRNA; cDNA DKFZp554N1154 (f
108864 AI733852 Hs.199957 ESTs
108895 AL138272 Hs.62713 ESTs
108921 AI568801 Hs.71721 ESTs
108967 AA142989 Hs.71730 ESTs

109001 AI055548 Hs.72116 

109003 AA147497 Hs.71825 

109004 AA156235 Hs.139077
109065 AA161125 Hs.252739
109250 H83784 Hs.62113
109490 AA233416 Hs.139202
109510 AI798863 Hs.87191 
109578 F02208 Hs.27214 

Hs.311662 
Hs.27519 
Hs.23540



Hs.127842 



109613 H47315 

109650 R31770 

109682 H18017 

109724 D59899 

109782 AB020644 

109833 R79864 Hs.29889 

109837 H00656 Hs.29792 

109977 T64183 Hs.282982 

109984 AI796320 Hs.10299 

110146 H41324 Hs.31581

110271 H28985 Hs.31330 

110280 AW874263 Hs.32468 
Hs.184261



ESTs; Moderately similar to hedgehog-int



ESTs; Weakly similar to PHOSPHATIDYLETHA



ng fatty acyl-CoA synthetase 2 gene 
ESTs; Moderately similar to SYNTAXIN IB



Hs.9218 
Hs.23666 
Hs.18827 
Hs.21559 
Hs.8022 
Hs.22945 



Hs.5740 
Hs.321677 
Hs.279798 
Hs.28694 



110420 R93141 

110578 T62507 Hs.11038 

110634 R98905 Hs.35992 

110726 AW961818 Hs.24379 

110837 H03109 Hs.108920 

110875 N35070 Hs.26401 

110894 R92356 Hs.66881 

110971 AI760098 Hs.21411 

111023 AV655386 Hs.7645 

111057 T79639 Hs.14629 

111247 AW058350 Hs.16762 

111330 BE247767 Hs.18166 

111374 BE250726 Hs.283724 

111442 AW449573 Hs.181003 

111737 H04607 

111747 AI741471 

111807 R33508 

111862 R37472 

112045 AI372588 

112057 R43713 

112214 AW148552 Hs.167398 

112263 R52393 Hs.25917 

112314 AW206093 Hs.748 

112324 R55965 Hs.26479 

112362 AW300887 

112380 H63010 

112425 AA324998 

112473 R65993 

112492 N51620 

112541 AF038392 Hs.116674 ESTs

112620 R80552 Hs.29040 ESTs 

112623 AW373104 Hs.25094 ESTs 

112867 T03254 Hs.167393 ESTs

112894 T08188 Hs.3770 ESTs 

112954 AA928953 Hs.6655 ESTs 

113029 AW081710 

113086 AA346839 Hs.209100 

113140 T50405 

113252 NM_004469 Hs.11392

113257 AI821378 Hs.159367 

113394 T81473 

113437 T85349 

113454 AI022166 

113502 T89130 

113552 AI654223 

113645 T95358 

113691 T96935 

113706 AA004693 Hs.269192 

113883 U89281 Hs.11958

113924 BE178285 Hs.170056 

114035 W92798 Hs.269181 

114058 AK002016 Hs.114727 

114084 AA708035 Hs.12248 

114121 H05785 Hs.25425 

114124 W57554 Hs.125019 

114275 AW515443 Hs.306117

114297 AA149707 Hs.173091

114427 AA017176 Hs.33532 

114449 AA020736 

114452 AI369275 Hs.243010 

114609 AA079505 

114648 AA101056 

114731 BE094291 Hs.155651 

114762 AA146979 Hs.288464 



potassium voltage-gated channel; shaker- 
ESTs; Weakly similar to semaphorin F [H. 
tumor necrosis factor (ligand) superfami 
ESTs; Moderately similar to cytoplasmic 



Homo sapiens mRNA; cDNA DKFZp564B2062 (f 
KIAA0870 protein 

ESTs; Moderately similar to HYA22 [H.sap



Hs.175967 ESTs 



Hs.177894 
Hs.15923 
Hs.16188 

Hs.16026 
Hs.333181 
Hs.17932 



ESTs 
ESTs 

Human lymphoid nuclear protein (LAF-4)
interleukin 13 receptor; alpha 1
DKFZP434K151 protein
ESTs; Highly similar to Miz-1 protein [H
"ze63b11.s1 Soares retina N2b4HR Homo sa
ESTs; Moderately similar to RTCO_HUMAN G
zm97 j a 3g re )lor r 2

"zn25b3.s1 Stratagene neuroepithelium NT
Homo sapiens HNF-3beta mRNA for hepatocy
AA151719
















AW015947




ESTs; Weakly similar to hypothetical L1




AW964897 


Hs.290825 






















































AA486620 


Hs.41135 














MO0 1732 


Hs.173233














AA454033 










Hs.55278 






AB029496 


Hs.59729 






AA292105 


Hs.326740


























ESTs; Weakly similar to testicular tekti 






Hs.45220 






AB007979 
















AB023179 








































































BE219453 








AW444761 










Hs.44583 






























N51075 
































AL109667


















Hs.49193 
























Hs.50813 


ESTs; Weakly similar to long chain fatty 






Hs.54522






AI979247


Hs.247043 


KIAA0525 protein 






Hs.226142






N94591 


Hs.323056 






BE245360 


Hs.279477 


















Accession not listed in Genbank 
























ESTs; Moderately similar to !!!! ALU SUB




W84346 










Hs.58815 














AA811339






120132 




Hs.125019 


Human lymphoid nuclear protein (LAF-4) 




AA223249 








AB023230




KIAA1013 protein 




























AA287702 
















AA400205 


Hs.104447






AA400914 
















AI743515






















121545 AA412442 Hs.98132 ESTs
121622 AA416931 Hs.126065 ESTs
121665 Hs.98234 ESTs
121709 AI338247 Hs.98314 Homo sapiens mRNA; cDNA DKFZp586U1
121730 AI140683 Hs.98328 ESTs
121740 AA421138 Hs.98334 EST
121772 AI590770 Hs.110347 Homo sapiens mRNA for alpha integrin bin
121821 AL040235 Hs.3346 ESTs
Hs.241519
Hs.95497
Hs.107149
Hs.124685
Hs.108135 



olfactory receptor; family 7; subfamily 




121835 AB033030 Hs.300670 ESTs 

121841 AA427794 Hs.104864 ESTs

121885 AA934883 Hs.98467 ESTs

121888 AA426429 Hs.98463 ESTs

121938 AA428659 Hs.98610 ESTs

121950 AA429515 EST 

122030 AA431310 Hs.98724 ESTs 

122054 AA431725 Hs.98746 EST

122211 AA300900 Hs.98849 ESTs; Moderately similar to bithoraxoid-

122233 AA436455 Hs.98872 EST 

122247 AA436676 Hs.98890 EST

122253 AA436703 Hs.104936 ESTs; Weakly similar to hypothetical pro 

122266 AA436840 Hs.98907 EST

122285 AA436981 Hs.121602 EST

122409 AA446830 Hs.99081 ESTs

122485 AA524547 Hs.160318 phospholemman

122697 AA420683 Hs.98321 Homo sapiens cDNA FLJ14103 fis, clone MA

122772 AW117452 Hs.99489 ESTs 

122831 AI857570 Hs.5120 ESTs 

122913 AI638774 Hs.105328 ESTs 

123049 BE047680 Hs.211869 ESTs

123076 AI345569 Hs.190046 ESTs 

123136 AW451999 Hs.194024 ESTs 

123309 N52937 Hs.102679 ESTs

123465 AA353113 Hs.112497 ESTs 

123691 AA609579 Hs.112724 ESTs

123756 AA609971 Hs.112795 EST

123802 AA620448

123837 AI807243 Hs.112893

123844 AA938905 Hs.120017 

123936 NM_004673

123987 C21171 

124013 AI521936 

124160 R40290 

124205 H77570 

124226 AA618527 Hs.190266 

124246 H67680 Hs.270962 

124348 AI796320 Hs.10299 

124358 AW070211 Hs.102415 

124409 AI814166 Hs.107197

124442 AW663632 Hs.285625 

124468 N51413 Hs.109284 

124479 AB011130 Hs.127436 

124519 AI670056 Hs.137274 

124711 NM_004657 Hs.26530

124866 AI758289 Hs.304389 

124874 BE550182 Hs.127826 

125097 AW576389 Hs.335774 

125179 AW206468 Hs.103118 

125200 AW836591 Hs.103156 

125299 T32982 Hs.102720 

125400 AL110151 Hs.128797

125810 H00083 

126176 BE242256 Hs.2441 

126303 D78841 

126403 AW629054 Hs.125976

126507 AL040137 Hs.23964 

126773 AA648284 Hs.187584 

127307 AW962712 Hs.126712 

127462 AA760776 Hs.293977 

127486 AW002846 Hs.105468

127572 AA594027 Hs.191788 

127609 X80031 Hs.530 

127832 AW975035 Hs.292396 

127898 AA774725 Hs.128970 

128073 AW340720 Hs.125983 

128101 AA905730 Hs.128254 

128149 NM_012214 Hs.177576

128212 W27411 Hs.336920

128333 W58800 Hs.12126

128364 N76462 Hs.269152 

128426 AI265784 Hs.145197 

128598 AA305407 Hs.102308 

128634 AA464918 

128687 AW271273 Hs.23767 

128726 AI311238 Hs.104476 

128773 NM_004131 Hs.1051

128833 W26657 Hs.184581

128870 H39537 Hs.75309

128878 R25513 Hs.10683

128885 AF134803 Hs.180141 

128998 W04245 Hs.107761 

129000 AA744902 Hs.107767

129038 AW156903 Hs.108124 

129098 AW580945 Hs.330466 






>35g1 1 .s1 Morton Fetal Cochlea Homo sa 
ESTs 

TATA box binding protein (TBP)-associate 
ESTs 

calcium channel; voltage-dependent; alph 
ESTs; Weakly similar to SPLICEOSOME ASSO
serum deprivation response (phosphatidyl 



ESTs 

DKFZP586D0824 protein 

aryl hydrocarbon receptor-interacting pr 

KIAA0022 gene product

HUM525A05B Human placenta polyA+ (TFuji

ESTs; Weakly similar to metalloprotease/

ESTs; Weakly similar to HC1 ORF [M.muscu



aa59b04.s1 NCI_CGAP_G< 



mannosyl (alpha-1;3-)-glycoprotein beta- 
glutathione peroxidase 3 (plasma) 
ESTs; Weakly similar to LR8 [H.sapiens]
ESTs; Weakly similar to ZINC FINGER PROT 



granzyme B (granzyme 2; cytotoxic T-lymp
ESTs 

eukaryotic translation elongation factor 
ESTs 

cofilin 2 (muscle) 

ESTs; Weakly similar to PUTATIVE RHO/RAC
ESTs; Moderately similar to CaM-KII inhi
129240 AA361258 Hs.237868

129262 BE222198 Hs.109843 

129301 AF182277 Hs.330780 

129331 AW167668 Hs.279772 

129381 AW245805 Hs.110903 

129565 X77777 Hs.198726 

129595 U09550 Hs.1154 

129613 AW978517 Hs.172847 

129782 AW016932 Hs.104105 

129950 F07783 Hs.1369 

129958 R27496 Hs.1378 

129959 AL036554 Hs.274463 
130160 AA305688 Hs.267695
130259 NM_000328 Hs.153614
130273 AW972422 Hs.153863
130312 AF056195 Hs.15430
130436 NM_001928 Hs.155597
130523 AA999702 Hs.214507
130799 AB028945 Hs.12696
130885 NM_005883 Hs.20912
131002 AL050295 Hs.22039
131012 AL039940 Hs.202949
131031 NM_001650 Hs.288650
131061 N64328 Hs.268744
131066 AW169287 Hs.22588
131082 AI091121 Hs.246218
131087 AF147709 Hs.22824
131161 AF033382 Hs.23735
131179 AA171388 Hs.184482
131182 AI824144 Hs.23912
131205 NM_003102 Hs.2420
131277 AA131466 Hs.23767

131281 AA251716 Hs.25227 

131282 X03350 Hs.4 
131285 AI557943 Hs.25274 
131355 R52804 Hs.25956 
131391 AW085781 Hs.26270 
131461 AA992841 Hs.27263 
131487 F13036 Hs.27373 
131517 AB037789 Hs.263395 
131545 AL137432 Hs.28564 
131583 AK000383 Hs.323092 
131647 AA359615 Hs.30089 

131675 H15205 Hs.30509 

131676 AI126821 Hs.30514 
131708 S60415 Hs.30941 
131717 X94630 Hs.3107 
131756 AA443966 Hs.31595 
131762 AA744902 Hs.107767 
131821 AA017247 Hs.164577 
131839 AB014533 Hs.33010 
131861 AL096858 Hs.184245 
132015 AI418006 Hs.3731 
132070 BE622641 Hs.38489 
132242 AA332697 Hs.42721 
132334 AW080704 Hs.45033 
132476 AL119844 Hs.49476
132490 NM_0 



ESTs 

Human cytochrome P450-IIB (hIIB3) mRNA;

ESTs; Highly similar to CGI-38 protein [ 

claudin 5 (transmembrane protein deleted 

vasoactive intestinal peptide receptor 1 

oviductal glycoprotein 1; 120kD 

ESTs; Weakly similar to collagen alpha 1 

EST 

decay accelerating factor for complement 
annexin A3

defensin; alpha 1; myeloid-related seque
UDP-Gal:betaGlcNAc beta 1;3-galactosyltr
retinitis pigmentosa GTPase regulator 
MAD (mo"" 



D component of complement (adipsin) 



adenomatous polyposis coli like
KIAA0758 protein
KIAA1102 protein



DKFZP586D0624 protein 
ESTs 

superoxide dismutase 3; extracellular 



butyrate response factor 2 (EGF-response 
Homo sapiens mRNA; cDNA DKFZp56401763 (f
ESTs; Highly similar to semaphorin VIa [
ESTs 

ESTs; Weakly similar to dual specificity 



calcium channel; voltage-dependent; beta 

CD97 antigen 

ESTs 

ESTs; Moderately similar to CaM-KII inhi 



Hs.172510
Hs.530 
Hs.53447 
Hs.61260 



132598 X80031 

132619 H28855 

132652 N41739 

132726 N52298 

133028 R51604 Hs.300842 ESTs 

133071 BE384932 Hs.64313 

133120 NM_003278 Hs.65424 

133129 AA428580 Hs.65551 

133147 AA026533 Hs.66 

133151 NM_014051 Hs.94896

133213 AA903424 Hs.6786

133276 AW978439 Hs.69504

133377 AJ131245 Hs.7239

133407 AF017987 Hs.7306

133535 AL134030 Hs.284180

133537 U41518 Hs.74602

133656 BE149455 Hs.75415

133689 NM_001872 Hs.75572



Hs.55608 ESTs; Weakly similar to cDNA EST yk484g1 

ESTs 

ESTs 

tetranectin (plasminogen-binding protein 
ESTs 

interleukin 1 receptor-like 1 



133978 AF035718 Hs.78061 

133985 L34657 Hs.78146

134000 AW175787 Hs.334841 

134111 AI372588 Hs.8022 

134185 AA285136 Hs.301914 

134204 AI873257 Hs.7994



ESTs 
transcription factor 21 
platelet/endothelial cell adhesion molec
selenium binding protein 1 
TU3A protein 

Homo sapiens mRNA; cDNA DKFZp586K1220 (f



ml 
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134677 AA251363 Hs.177711 ESTs 



Hs.156114 protein tyrosine phosphatase; non-recept



134745

134749

134786

134825

134978
Hs.89485 carbonic anhydrase IV

Hs.89640 angiopoietin 1 receptor; TEK tyrosine ki

Hs.197764 thyroid transcription factor 1

Hs.333383 ficolin (collagen/fibrinogen domain-cont



syntrophin; beta 1 (dystrophin-associate



135236 AI636208 Hs.95901 

135266 R41179 Hs.97393 

135346 NM_000928 Hs.992

135378 AW961818 Hs.24379

135387 NM_001972 Hs.99863

135388 W27955 Hs.99865 
135402 L12398 Hs.99922 



Human mRNA for KIAA0328 gene; partial cd 
phospholipase A2; group IB (pancreas) 
potassium voltage-gated channel; shaker- 
elastase 2; neutrophil 
Table 2B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 2A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.



Accession: Genbank accession numbers 



CATnu 



108447 
108550 
108655 
102397 
126303 
125810 
103627 
121366 



43452.-7 
120073.1 
127522J 



116777.1 
172113 1 
11 21 86 J 
114012.1 



AA079126

AA084867 AA084996
AA099960 AA113013
U41898

D78841 D78880
H00083 R81062
Z48513 Z48512

AI743515 AA405617 AW276706
AA079505 AA079537
AW015947 AA211890 AA279425
AA070773 AA070774
AA078899 AA078782 AA075788
genbank_AA620448 AA620448
NOT_FOUND_entrez_U33839 U33839
entrez_U67092 U67092
genbank_AA026349
genbank_AA256837
genbank_T89130 T89130
genbank_AA083103
entrez_L41390 L41390
genbank_AA102425 



AA256837 



AA102425 

M86361 Z26593 X02850 D13070 AE00065S M17649 M87869 M87871 X61077 M15286 AF018169 X61079 S59351 X60142 AF043169 
entrez_X76223 X76223 
entrez_Y10141 Y10141 
entrez_Z26256 Z26256 
NOT_FOUND_entrez_W37937 W37937 
genbank_AA398722 AA398722 
AA464918_at AA464918 
genbank_AA397825 AA397825 
genbank_AA412155 AA412155 
genbank_AA020736 AA020736 
genbank_AA101056 AA101056 
genbank_AA429515 AA429515 
genbank_AA015967 AA015967 
Table 3A shows 452 genes up-regulated in chronically diseased lung relative to normal lung. Chronically diseased lung samples represent chronic non-malignant lung diseases 

These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was 
expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression. 

Unigene Title: Unigene gene title 
R1: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples. 
R2: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of normal lung samples, squamous cell carcinomas and adenocarcinomas. 
R3: 70th percentile of AI for chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples 
divided by the 90th percentile of normal lung samples, squamous cell carcinomas and adenocarcinomas minus the 15th percentile of AI for all normal lung, 
chronically diseased lung and tumor samples 
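
The percentile-based ratios defined above can be illustrated with a short sketch. The Python below is an illustration only, under stated assumptions: the function names `percentile` and `percentile_ratio`, the linear-interpolation percentile rule, and the sample intensity values are hypothetical and are not taken from the specification, which does not state how the percentiles were computed.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) of values.

    Assumption: linear interpolation between closest ranks; the source
    does not say which percentile convention was used.
    """
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def percentile_ratio(case, reference, case_p, ref_p, background=0.0):
    """Ratio of a case-group percentile to a reference-group percentile,
    with an optional background level subtracted from both."""
    denominator = percentile(reference, ref_p) - background
    if denominator <= 0:
        raise ValueError("reference percentile does not exceed background")
    return (percentile(case, case_p) - background) / denominator


# Hypothetical average-intensity values for a single probeset.
diseased = [120.0, 150.0, 180.0, 210.0, 400.0]
normal = [20.0, 25.0, 30.0, 35.0, 40.0]
tumor = [22.0, 28.0, 33.0, 38.0, 45.0]

# First ratio: 80th percentile (diseased) over 90th percentile (normal).
simple = percentile_ratio(diseased, normal, 80, 90)

# Third ratio: the 15th percentile over all samples is used as background.
background = percentile(normal + diseased + tumor, 15)
corrected = percentile_ratio(diseased, normal + tumor, 70, 90, background)

print(round(simple, 2), round(corrected, 2))
```

Subtracting a low percentile of all samples before dividing keeps probesets with near-background signal in the reference group from producing inflated ratios, which appears to be the purpose of the background-corrected definition.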



Hs.138751 


Human BRCA2 region, mRNA sequence CG030 


solute carrier family 35 (CMP-sialic aci 


KIAA0871 protein 
























AF035718 




transcription factor 21 


AF085983 


Hs.293fa76 


AB023177 


Hs.29900 
















SAC2 (suppressor of actin mutations 2, 
















' 






132548 


X12830 


Hs.193400 


interleukin 6 receptor 




7.20 




132476 


AL119844 




Homo sapiens clone TUA8 Cri-du-chat regi 








132439 


AK001942 


Hs.4863 


hypothetical protein DKFZp566A1524 






1.88 


132240 


AB018324 


Hs.42676 


KIAA0781 protein 


21.20 






132210 


NM_007203 


Hs.42322 


A kinase (PRKA) anchor protein 2 






1.99 


132199 


AL041299 


Hs.165084 


ESTs 


15.20 






131751 


T96555 


Hs.31562 


ESTs 






1.76 


131745 


AIB28559 


Hs.31447 


ESTs, Moderately similar to A46010 X-li 


27.80 






131694 


NM_000246 


Hs.3076 


MHC class II transactivator 








131686 


NM_012296 


Hs.30687 


GRB2-associated binding protein 2 








131676 


AI126821 


Hs.30514 


ESTs 








131629 


Z45794 


Hs.238809 


ESTs 


21.40 






131589 


C18825 


Hs.29191 


epithelial membrane protein 2 








131536 


AA019201 


Hs.269210 


ESTs 




9.40 




131517 


AB037789 


Hs.263395 






3.59 




131355 


R52804 


Hs.25956 


DKFZP564D206 protein 




4.48 




131253 


R71802 


Hs.24853 


ESTs 


15.00 






131207 


AF104266 


Hs.24212 


latrophilin 






1.75 


131156 


AI472209 


Hs.323117 


ESTs 






1.84 


131066 


AW169287 


Hs.22588 


ESTs 




3.54 




131061 


N64328 


Hs.268744 


KIAA1796 protein 








131053 


AA348541 


Hs.296261 


guanine nucleotide binding protein (Gpr 






1.93 


130895 


AA641767 


Hs.21015 


hypothetical protein DKFZp564L0864 simil 


16.60 






130762 


D84371 


Hs.1898 


paraoxonase 1 


12.00 
0657 AW337575 Hs.201591 

130655 AI831962 Hs.17409 

130589 AL110226 Hs.16441 

130562 D50402 Hs.182611 

130555 R69743 Hs.116774 

130365 W56119 Hs.155103 

130273 AW972422 Hs.153863 

130259 NM_000328 Hs.153614 

H97878 Hs.132390 

R27496 Hs.1378 

AI672731 Hs.13256 

AA181018 Hs.13056 

AB007899 Hs.12017 



PCT/US02/12476 



129565 X77777 



Hs.98314 
Hs.198726 
Hs.270847 
Hs.11112 



cysteine-rich protein 1 (intestinal) 
DKFZP434H204 protein 
solute carrier family 11 (proton-coupled 
integrin, alpha 1 

eukaryotic translation initiation factor 
MAD (mothers against decapentaplegic, Dr 
retinitis pigmentosa GTPase regulator 
zinc finger protein 36 (KOX18) 
annexin A3 

hypothetical protein FLJ13920 
homolog of yeast ubiquitin-protein ligas 
ferritin, light polypeptide 
Homo sapiens cDNA FLJ12566 fis, clone NT 

Homo sapiens mRNA; cDNA DKFZp586L0120 (f 



Hs.110334 
Hs.237868 
Hs.202949 

H S 2 0 6 95 U 

Hs.302043 
Hs.139851 
Hs.186709 
Hs.296460 



129402 W72062 

129385 AA172106 

129315 NM_014563 Hs.174038 

129312 T97579 

129240 AA361258 

129210 AL039940 

129122 AW958473 

129057 N90866 
Y13153 

128798 AF015525 

128789 AW368576 

128778 AA504776 

128766 AW160432 

128631 R44238 

128624 BE154765 

i NMJM3616 

128603 NMJW4915 
AA305407 
H55864 

128061 AF150882 

127968 AA830201 

""""" AI302471 

127944 AI557081 

127925 AA805151 



127859 AA761802 

127817 AA836641 

127742 AW293496 

127628 AI240102 

127609 X80031 

127582 AA908954 

127543 AK000787 

127535 AA568424 

127404 AI379920 

127396 L31968 

AA442797 

127346 AA203616 

127340 BE047653 

127307 AW962712 

127242 AW390395 

127167 AA625690 

127046 AA321948 

126928 AA480902 

126900 AF137386 

126852 AA399961 

126816 AA248234 

126812 AB037860 

126666 AA648886 

126645 AA316181 

126592 AI611153 

126556 AF255303 

126433 AA325606 

126299 AW979155 

126218 AL049801 

126182 AA721331 

126177 AW752782 

126142 H86261 

126077 M78772 

125994 AI990529 

125934 AA193325 

125847 AW151885 

125831 H04043 
R61771 

125676 BE612918 

125581 F18572 
H09701 



Rag C protein 

spondyloepiphyseal dysplasia, late 
ESTs, Weakly similar to I78885 serine/th 
interleukin 7 receptor 
KIAA1102 protein 

nudix (nucleoside diphosphate linked moi 
CDW52 antigen (CAMPATH-1 antigen) 



chemokine (C-C motif) receptor-like 2 

ESTs, Weakly similar to I38022 hypothet 
craniofacial development protein 1 
KIAA1080 protein; Golgi-associated, gamm 
ES fs / -aHy -imilar to TRHY.HUMAN TRICH 




Hs.173933 
Hs.151999 
Hs.61635 
Hs.6093 
Hs.112227 



Hs.13649 
Hs.293771 
Hs.129750 
Hs.40568 
Hs.210836 
Hs.270799 
Hs.32646 
Hs.249034 



ESTs, Weakly similar to ZN91_HUMAN ZINC 
ESTs, Weakly similar to AF191020_1 E21G5 
cathepsin S 



plasmolipin 

gb:zu68c01.r1 Soares_testis_NHT Homo sap 
gb:csg2228.seq.F Human fetal heart, Lamb 
nuclear factor I/A 



sixtran; 



Homo sapiens cDNA: FLJ22783 fis, clone K 
membrane-associated nucleic acid binding 
gb:EST28707 Cerebellum II Homo sapiens c 
amino acid transporter 2 
Novel human gene mapping to chromosome 13 
ESTs 

hypothetical protein FLJ10546 



hypothetical protein FLJ21901 
ESTs 

gb:yj45c03.r1 Soares placenta Nb2HP Homo 
ESTs 

hypothetical protein FLJ23511 
ESTs, Weakly similar to ALU4_HUMAN ALU S 
l,I / kly similar ' E 22 h> he! 
ESTs, Moderately similar to ALU7_HUMAN A 



Hs.161378 
Hs.183745 
Hs.102541 
Hs.9788 

Hs.26530 
Hs.270594 
Hs.231500 
Hs.42322 
Hs.102670 
Hs.11090 

Hs.293039 
Hs.151323 
Hs.101689 
Hs.170278 

123972 T46848 Hs.70337 
123961 AL050184 Hs.21610 
123936 NM_004673 Hs.241519 
123802 AA620448 
123734 AA609861 
123619 AA602964 
123596 AA421130 
123476 AA384564 
123340 AA504264 



125309 T12411 

125167 AL137540 

125139 AW194933 

125042 T78906 

124711 NM_004657 

124631 NM_014053 

124578 N68321 

124574 AL036596 

124472 N52517 

124438 BE178536 

124357 N22401 

124306 AW973078 

124214 H58608 

124097 AW298235 

123978 T89832 



Hs.312447 



Hs.106771 

Hs.98506 
Hs.193784 
Hs.300670 
Hs.178098 
Hs.110286 
Hs.193767 
Hs.98175 
Hs.126065 
Hs.97901 
Hs.287727 
Hs.182538 
Hs.97509 



123136 AW451999 

123073 AA485061 

123055 AA482005 

122699 AA456130 Hs.301721 

122679 AA811286 Hs.192837 

122633 NM_001546 Hs.34853 

122553 AA451884 Hs.190121 

122544 AW973253 Hs.292689 

122485 AA524547 Hs.160318 

122211 AA300900 Hs.98849 

122127 AW207175 

122011 AA431082 

121992 AI860775 

121989 W56487 

121835 AB033030 

121726 AF241254 

121690 AV660305 

121643 AA640987 

121633 AA417011 

121622 AA416931 

121497 AA412031 

121351 AW206227 

121314 W07343 

121242 AA400857 

121059 AA393283 

120934 AA226198 

120755 AA312934 

120637 AA811804 

120484 AA253170 

120336 N85785 

120266 AI807264 

120132 W57554 

120041 AA830882 



119970 AA767718 

119861 W78816 

119824 W74536 

119740 AW021407 

119271 AI061118 

119221 C14322 

119126 R45175 

119073 BE245360 

118928 AA312799 

118901 AW292577 

118661 AL137554 

118607 AI377444 

118449 AI813885 

118416 N66028 

118379 N64491 

118329 N63520 

118320 N63451 

118253 AA497044 

118124 N56968 

118056 AB037746 

118032 N52802 

117840 T26379 

117404 N39725 

117314 N32498 



Hs.141600 
Hs.20887 
Hs.45707 
HS.4276B 
Hs.47544 
Hs.48802 
Hs.15220 
Hs.42829 



hypothetical protein FLJ13456 
netrin 4 

hypothetical protein MGC10924 similar to 
E I f ierat / similar t ALU1 I IMAN 
serum deprivation response (phosphatidyl 
FLVCR protein 
EST 

A kinase (PRKA) anchor protein 2 



ESTs 



Hs.182937 
Hs.105228 
Hs.194024 
Hs.105652 



DKFZP434B203 protein 
angiopoietin-like 1 
gb:ae58o09.s1 Stratagene lung cai 
ESTs 

gb:no97c02.s1 NCI_CGAP_Pr2 Homo sapiens 

EST 

ESTs 

peptidylprolyl isomerase A (cyclopl 



i, Weakly similar to reverse transcri 
KIAA1255 protein 

ESTs, Weakly similar to ALU5_HUMAN ALU S 
inhibitor of DNA binding 4, dominant neg 



gb;zw78a10.s1 Soaresjesi 



gb:zt74e03.r1 Soares_testis_NHT Homo sap 
gb:nc26a07.s1 NCI_CGAP_Pr1 Homo sapiens 
Homo sapiens cDNA: FLJ21326 fis, clone 
gb:ob39a05.s1 NCI_CGAP_GCB1 Homo sapiens 



tryptase beta 1 
ESTs 



Hs.49943 

Hs.184 

Hs.21068 

Hs.65328 

Hs.250700 

Hs.117183 

Hs.279477 

Hs.283689 

Hs.94445 

Hs.49927 

Hs.54245 

Hs.164478 



Hs.48990 ESTs 



hypothetical protein FLJ10512 
ESTs, Weakly similar to S65657 alpha-1C- 
advanced glycosylation end product-speci 
hypothetical protein 

Fanconi anemia, complementation group F 



activator of CREM in testis 
ESTs 

protein kinase MYD-SP15 

ESTs, Weakly similar to S65824 reverse 1 

hypothetical protein FLJ21939 similar to 



gb:yy62f01.s1 Soares_multiple_sclerosis_ 
ESTs, Weakly similar to alternatively s 
hypothetical protein FLJ10392 
chromosome 21 open reading frame 37 
hypothetical protein DKFZp761O0113 
EST 
116784 AB007979 

116766 AI608657 

116712 AW901618 

116707 H10344 

116351 AL133623 

116279 AW971248 

116166 AL039940 

116152 AL040521 

116117 BE613410 

116107 AL133916 

115965 AA001732 

115955 AF263613 

115844 AI373062 

115683 AF255910 

115673 AA406341 

115672 AI889110 

115566 AI142336 

115313 AA808001 

115279 AW964897 

115230 AA278300 

115110 AK001671 

114999 BE246481 

114930 AA237022 

114922 AA235672 

114837 BE244930 

114769 AA149060 

114761 AA143781 

114736 AI610347 

114596 AA310162 

114518 AW163267 

114455 H37908 

114452 AI369275 

114359 NM_016929 

114357 R41677 

114251 H15261 

114138 AW384793 

114124 W57554 

113946 AW083883 

113695 T96965 

113606 NM_013343 

113590 R49642 

113560 T91015 

113552 AI654223 

113540 AW152618 

113502 T89130 

113288 AI076838 

113252 NM_004469 

113238 R45467 

113203 AA743663 

113195 H83265 

113089 T40707 

113076 AF033199 

113009 T23699 

112937 AI694320 

112891 T03927 

112794 R97018 

112691 R88708 

112602 AW004045 

112366 AF035318 

112210 R49645 

112064 AL049390 

111998 R42379 

111987 NM_015310 

111803 AA593731 

111737 H04607 

111605 T91061 

111510 R07856 

111341 AL157484 

111280 AA373527 

111247 AW058350 

111232 AI247763 

110942 R63503 

110924 AW058463 

110837 H03109 

110824 AI767183 

110776 AB032417 

110576 H60869 

110369 AK000768 

110099 R44557 

109984 AI796320 

109958 AA001266 



Hs.301281 
Hs.95097 
Hs.61935 
Hs.49050 
Hs.82501 
Hs.291289 
Hs.202949 
Hs.15220 
Hs.31575 
Hs.172572 
Hs.173233 
Hs.44198 
Hs.332938 
Hs.54650 
Hs.269908 
Hs.73251 
Hs.43977 
Hs.184411 
Hs.290825 
Hs.124292 
Hs.11387 
Hs.87856 
Hs.188717 
Hs.87491 
Hs.166895 
Hs.296100 



MSTP043 protein 

Homo sapiens mRNA; cDNA DKFZp586N0121 (f 
gb:yp86a10.s1 Soares fetal liver spleen 
Homo sapiens mRNA, chi 



Homo sapiens mRNA; cDNA DKFZp761!071 (fr 
ESTs, Weal n a t - Chair A, Humai 
similar to mouse Xm1 / Dhm2 protein 
ESTs, Weakly similar to ALU1_HUMAN ALU S 
KIAA1102 protein 
zinc finger protein 106 
SEC63, endoplasmic reticulum translocon 
hypothetical protein FLJ20093 
hypothetical protein FLJ10970 



hypothetical protein MGCS370 

junctional adhesion molecule 2 

Homo sapiens cDNA FLJ11991 fis, clone HE 

ESTs 

Human DNA sequence from clone RP11-195N1 



Hs.271616 

Hs.243010 

Hs.283021 

Hs.6107 

Hs.21948 

Hs.15740 

Hs.125019 

Hs.37896 

Hs.17948 

Hs.278951 

Hs.142447 

Hs.268626 

Hs.16026 

Hs.16757 



suppressor of var1 (S.cerevisiae) 3-like 

ESTs, Weakly similar to ALU8_HUMAN ALU S 

Homo sapiens cDNA FLJ14445 fis, clone HE 

chloride intracellular channel 5 

Homo sapiens cDNA FLJ14839 fis, clone OV 

ESTs 

Homo sapiens mRNA; cDNA DKFZp434E033 (fr 
ESTs 

Homo sapiens cDNA FLJ13510 fis, clone PL 
E T I/ n B HUMA 

NAG-7 protein 

ESTs, Weakly similar to ALU1_HUMAN ALU S 
ESTs 

hypothetical protein FLJ23191 

gb:ye12d01.s1 Stratagene lung (937210) H 
ESTs 

c-fos induced growth factor (vascular en 



Hs.10305 

Hs.8881 

Hs.270862 

Hs.8198 

Hs.7246 

Hs.6295 

Hs.293147 

Hs.220647 

Hs.203365 

Hs.12533 

Hs.7004 

Hs.22689 

Hs.138283 

Hs.6763 

Hs.325823 

Hs.9218 

Hs.194178 

Hs.16355 

Hs.22483 

Hs.19385 

Hs.16762 



ESTs, Weakly similar to S41044 chromosom 
ESTs 

zinc finger protein 204 
ESTs 

ESTs, Weakly similar to T17248 hypotheti 
ESTs, Moderately similar to A46010 X-li 
gb:yq74b08.s1 Soares fetal liver spleen 

ESTs 

Homo sapiens clone 23705 mRNA sequence 
ESTs 

Homo sapiens mRNA; cDNA DKFZp586O1318 (f 



ESTs, Moderately similar to PC4259 fern 
ESTs 

Homo sapiens mRNA; cDNA DKFZp762M127 (fr 
CGI-58 protein 

Homo sapiens mRNA; cDNA DKFZp564B2062 (f 



zinc-fingers and homeoboxes 1 



Hs.26942 
Hs.19545 
Hs.37889 
Hs.107872 
Hs.23748 



HT018 F 
ESTs 



frizzled (Drosophila) homolog 4 
ESTs 

hypothetical protein FLJ20761 
ESTs 

Homo sapiens cDNA FLJ13545 fis, clone FL 
ESTs 
ESTs, Weakly similar to T26845 hypotheti 










gb:zl84c04.s1 Stratagene colon (937204) 


Homo sapiens mRNA; cDNA DKFZp564G112 (fr 






AA04570B 


Ig superfamily receptor LNIR 










ESTs, Moderately similar to ALU7_HUMAN A 






AA010611 














DKFZP566F2124 protein 


hypothetical protein FLJ20417 


















Hs.26530 


















AA485055 


NM_007118 










BE613328 
























Homo sapiens mRNA; cDNA DKFZp564B075 (fr 


15.20 


AI299139 






























R65998 
























phosphatidylinositol-4-phosphate 5-kinas 










Homo sapiens mRNA for KIAA1568 protein, 


AA076049 


Hs.274415 
















104074 


AL162039 


H&3H22 


Homo sapiens mRNA; cDNA DKFZp434M229 (fr 


11.20 


103749 


AL135301 


Hs.8768 


hypothetical protein FLJ10849 


10.86 


103645 


AW246253 


Hs.7043 


succinate-CoA ligase, GDP-forming, alpha 


12.00 


103554 


AI378826 


Hs.323469 


caveolin 1, caveolae protein, 22kD 






AI815601 


Hs.79197 


CD83 antigen (activated B lymphocytes, i 




103496 


Y09267 


Hs.132821 


flavin containing monooxygenase 2 




103428 


BE383507 


Hs.78921 


A kinase (PRKA) anchor protein 1 


11.20 




X89399 


Hs.119274 


RAS p21 protein activator (GTPase activa 


19.80 
NM_005574 


























102530 


U60808 


Hs.152981 


CDP-diacylglycerol synthase (phosphatida 


2540 


gb:Human alpha satellite and satellite 3 












































224 


2.01 


101168 


NM_005308 


Hs.211569 


G protein-coupled receptor kinase 5 


































101066 


AW970254 




mm 1 , n crystal protein 


19.38 




1.91 


100971 


BE379727 


Hs.83213 


fatty acid binding protein 4, adipocyte 








BE245294 


Hs.180789 










100770 


W25797.com 


Hs.177486 


amyloid beta (A4) precursor protein (pro 








100716 


X89887 


Hs.172350 


HIR (histone cell cycle regulation defec 


1480 






100425 


NM_014747 


Hs.78748 


KIAA0237 gene product 


1&20 






100382 
100351 


D83407 
D64158 


Hs.56045 
Hs.156007 


Down syndrome critical region gene 1-lik 




4.00 
4.24 
5.20 




100299 
100134 
100108 


D49493 

AA305746 

U09577 


Hs.2171 
Hs.49 
Hs.76873 


growth differentiation factor 10 
macrophage scavenger receptor 1 




2120 


1.79 


100095 
100066 


Z97171 


Hs.78454 


myocilin, trabecular meshwork inducible 


11.29 


5.40 





TABLE 3B shows the accession numbers for those primekeys lacking UnigeneID's for Table 3A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accession 

123619 371681.1 AA602964 AA609200 
126433 127143.1 AA325606 AA099517 N89423 
125831 1522905.1 H04043 D60988 D60337 
126816 122973.1 AA248234 AA090935 
126852 136135.1 AA399961 AA128347 
121059 273450.1 AA393283 AA398628 
120637 200885.1 AA811804 AA809404 AA286907 AW977624 
122011 7617-2 AA431082 
120934 177521.1 AA226198 AA226513 AA383773 
123802 genbank_AA620448 AA620448 
116814 genbank_H50834 H50834 
118329 genbank_N63520 N63520 
104404 H58762_at H58762 
104776 genbank_AA026349 AA026349 
113502 genbank_T89130 T89130 
101262 entrez_L35854 L35854 
108573 genbank_AA086005 AA086005 
101447 entrez_M21305 M21305 
124357 genbank_N22401 N22401 
108781 genbank_AA128654 AA128654 
112794 genbank_R97018 R97018 
100351 entrez_D64158 D64158 
100555 tigr_HT2245 M69181 M81105 U51039 
Table 4A shows 202 genes up-regulated in samples from patients treated with chemotherapy or radiotherapy. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: average of AI for samples from patients treated with chemotherapy or radiotherapy divided by the average of AI for normal lung samples. 
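
The R1 value for this table is a ratio of group means rather than of percentiles. A minimal sketch of that arithmetic follows; the function name `mean_ratio` and the intensity values are hypothetical and serve only to illustrate the definition given above.

```python
from statistics import mean


def mean_ratio(treated, normal):
    """Mean average intensity of treated samples over that of normal samples."""
    normal_mean = mean(normal)
    if normal_mean <= 0:
        raise ValueError("mean of normal samples must be positive")
    return mean(treated) / normal_mean


# Hypothetical average-intensity values for a single probeset.
treated = [300.0, 420.0, 360.0]
normal = [15.0, 20.0, 25.0]

r1 = mean_ratio(treated, normal)
print(r1)
```

Because a mean is sensitive to a single extreme sample, a ratio computed this way may respond more strongly to outliers than the percentile-based ratios used for the other tables.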



Pkey 




UnigeneID 








NM_001269 
































Hs.167185 


glutamate receptor, metabotropic 5 






NM_001949 






















Hs.27973 






































AW959908 




















NM_001944 




















AA176374 






























U04045 




















NM_002202 










AA296874 










U44060 












preferentially expressed antigen in mela 






NM_006183 










NM_001975 


Hs.146580 


enolase 2, (gamma, neuronal) 






M13509 


Hs.83169 


matrix metalloproteinase 1 (interstitial 
















EE27C266 




5T4 oncofetal trophoblast glycoprotein 
















AW015318 


Hs.23165 








AW503733 










BE387790 




hypothetical protein FLJ20287 








Hs.283978 








AA767526 


Hs.22030 


paired box gene 5 (B-cell lineage specif 






AL157441 




downstream neighbor of SON 






AW965058 












ii8.234u7*1 


Homo sapiens mRNA," cDNA UKf-Zp7o1 bU21 21 ( 






AL134708 










AW970602 




















AI458623 










AB023139 




















AA443473 


Hs.173684 


















BE409857 


Hs.69499 


hypothetical protein 
















AA219691 










AW978515 


Hs.131915 


KIAA0863 protein 






AK001355 


Hs.279610 


hypothetical protein FLJ10493 






AW975746 


Hs.188662 






109384 


AA219172 


Hs.86849 


ESTs 


21.00 


109415 


U80736 


Hs.110826 


trinucleotide repeat containing 9 


31.60 




AA232103 


Hs.189915 








AW967069 




ESTs* 16 " 03 Prt>em 




109633 


AW003785 


Hs.170267 




20.40 


109786 


AI989482 


Hs.146286 


kinesin family member 13A 


19.60 


109958 


AA001266 


Hs.133521 


ESTs 


24.00 


110920 


N47224 


Hs.20521 


HMT1 (hnRNP methyltransferase, S, cerevi 


28.40 


110924 


AW058463 


Hs.12940 


zinc-fingers and homeoboxes 1 


36.00 


111084 


H44186 


Hs.15456 


PDZ domain containing 1 


61.20 


111132 


AB037807 


Hs.83293 


hypothetical protein 


24.60 


111229 


AW389845 


Hs.110855 


ESTs 


27.20 


111337 


AA837396 


Hs.263925 


LIS1-interacting protein NUDE1, rat homo 


48.00 


111987 


NM_015310 


Hs.6763 


KIAA0942 protein 


37.80 


112046 


AA383343 


Hs.22116 


CDC14 (cell division cycle 14, S, cerevi 


26.80 


112268 


W39609 


Hs.22003 


solute carrier family 6 (neurotransmitte 


63.80 


112685 


R87650 


Hs.33439 


E T ,'vi I i in 1 i UU1 HUMAN ALU 


26.40 


112871 


AL110216 
AW206453 


Hs.12285 
Hs.3782 


ESTs, Weakly similar to 155214 salivary 
ESTs 


47.64 
22.00 


112897 
112973 


AB033023 


Hs.318127 


hypothetical protein FLJ10201 


65.00 


112992 


AL157425 


Hs.133315 


Homo sapiens mRNA; cDNA DKFZp761J1324 (f 


42.00 


113073 


N39342 


Hs.103042 


microtubule-associated protein 1B 


55.40 
AA457211 
































































zinc finger protein 83 (HPF1) 
















AW966931 




nucleosome assembly protein 1-like 1 




















hypothetical protein FLJ10618 




































AW872527 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH 








Hs.61232 








AL133916 


Hs.172572 


hypothetical protein FLJ20093 






AA889120 


Hs.110637 


homeo box A10 
















AF161470 


Hs.260622 


butyrate-induced transcript 1 






AW005054 


Hs.47883 






















gb:za49d07.s1 Soares fetal liver spleen 






AI824009 


Hs.44577 


















AA918317 










AL050097 


Hs.272531 


DKFZP586B0319 protein 
















AA258356 














achaete-scute complex (Drosophila) homol 






AA398209 




















AW450737 


Hs.128791 


CGI-09 protein 






























gb:ab19f02.s1 Stratagene lung (937210) H 






























gb:no97c02.s1 NCI_CGAP_Pr2 Homo sapiens 
















BE079334 


Hs.271630 








AI333756 


Hs.111801 


arsenate resistance protein ARS2 








Hs.102670 








AW628168 










NM_014053 


Hs.270594 


FLVCR protein 








Hs.140942 








AA610620 




major histocompatibility complex, class 








Hs.178294 


















AA628962 




















AL360190 


Hs.295978 








AW161885 


Hs.249034 








AA193325 




hypothetical protein FLJ21901 








Hs.210836 








AW979155 


Hs.298275 








AI468004 


Hs.278956 


hypothetical protein FLJ12929 






AA325606 












Hs.23850 




















Hs.151999 








AB037860 




nuclear factor I/A 


























AW771958 




ESTs, Moderately similar to PC4259 tern 
















AW297206 


Hs.164018 


















AA805151 












Hs.123304 










Hs.124347 


















H07103 


Hs.286014 


















AI878918 


Hs.10526 


cysteine and glycine-rich protein 2 




128949 


AA009647 


Hs.8850 


a disintegrin and metalloproteinase doma 


23.00 


129168 


AI132988 


Hs.109052 


chromosome 14 open reading frame 2 


37.60 


129404 


AI267700 


Hs.317584 


ESTs 


28.60 


129527 


AA769221 


Hs.270847 


delta-tubulin 


40.80 


129574 


AA026815 




UMP-CMP kinase 


31.20 


129598 


N30436 


Hs.11556 


Homo sapiens cDNA FLJ12566 fis, clone NT 


29.60 


129785 


H19006 


Hs.184780 


ESTs 


72.20 


129970 


AV655806 


Hs.296198 


chromosome 12 open reading frame 4 


22.20 
130482 AW409701 Hs.1578 

130617 M90516 Hs.1674 

130703 R77776 Hs.18103 ESTs 

130732 AW890487 Hs.63984 

130867 NM_001072 Hs.284239 

131028 AI879165 Hs.2227 

131086 AL035461 Hs.2281 

131284 NM_001429 Hs.25272 

131775 AB014548 Hs.31921 

131860 BE383676 Hs.334 

131945 NM_002916 Hs.35120 

132040 NM_001196 

132084 NM_002267 
protein kinase, DNA-activated, catalytic 
Homo sapiens cDNA FLJ20653 fis, clone KA 
baculoviral IAP repeat-containing 5 (sur 
glutamine-fructose-6-phosphate transamin 



cadherin 13, H-cadherin (heart) 
UDP glycosyltransferase 1 family, polype 
CCAAT/enhancer binding protein (C/EBP), 



132437 AA152106 

132550 AW959253 

132617 AF037335 

132632 AU076916 

132672 W27721 

132742 AA025480 

132771 Y10275 



133181 X91662 

133282 AA449015 

133350 AI499220 

133592 AV652066 

133658 AA319146 

133865 AB011155 

134032 NM_005025 

134125 NM_014781 

134158 U15174 

134321 BE538082 

134367 AA339449 

134570 U66615 

134753 NM_006482 

135002 AA448542 

135029 H58818 

135047 AL134197 



Hs.190044 
Hs.4859 
Hs.170195 
Hs.5338 
Hs.5398 
Hs.54697 
Hs.292812 

Hs.4311 
Hs.66170 
Hs.66744 
Hs.286145 
Hs.71573 



replication factor C (activator 1) 4 (37 
Homo sapiens cDNA: FLJ22373 fis, clone H 
karyopherin alpha 3 (importin alpha 4) 



ESTs, Weakly similar to T33468 hy 



Hs.50421 

Hs.79428 

Hs.8172 

Hs.82285 

Hs.172280 

Hs.173135 

lls.25167; 

Hs.187579 

Hs.93597 

Hs.99171 



HSKM-B protein 

twist (Drosophila) homolog (acrocephalos 
SRB7 (suppressor of RNA polymerase B, ye 
hypothetical protein FLJ10074 
general transcription factor IIIA 
secretogranin II (chromogranin C) 
discs, large (Drosophila) homolog 5 
serine (or cysteine) proteinase inhibito 
KIAA0203 gene product 
BCL2/adenovirus E1B 19kD-interacting pro 
ESTs, Moderately similar to A46010 X-lin 

SWI/SNF related, matrix associated, acti 
dual-specificity tyrosine-(Y)-phosphoryl 
G antigen 7B 

hydroxysteroid (17-beta) dehydrogenase 
cyclin-dependent kinase 5, regulatory su 



TABLE 4B shows the accession numbers for those primekeys lacking UnigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 



123619 371681.1 AA602964 AA609200 

126433 127143.1 AA325606 AA099517 N89423 

126872 142696.1 AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE011212 BE011359 

BE011367 BE011368 BE011362 BE011215 BE011365 BE011363 

106851 322947.1 AI458623 AA639708 AA485409 R22065 AA485570 

118720 genbank_N73515 N73515 

120515 genbank_AA258356 AA258356 

117099 321871.1 H93699 H97976 H80036 

101447 entrez_M21305 M21305 

123130 genbank_AA487200 AA487200 
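
Rows of this kind (probeset key, cluster number, then one or more accession numbers) lend themselves to a simple lookup structure. The sketch below is illustrative only: the function names and the dictionary layout are assumptions, and the three sample rows are copied from the table above.

```python
def parse_cluster_row(line):
    """Split a 'Pkey CAT-number accession...' row into its three parts."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("row needs at least a Pkey and a CAT number")
    return fields[0], fields[1], fields[2:]


def build_lookup(lines):
    """Map each Pkey to its CAT number and list of Genbank accessions."""
    lookup = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines
        pkey, cat_number, accessions = parse_cluster_row(line)
        lookup[pkey] = {"cat_number": cat_number, "accessions": accessions}
    return lookup


rows = [
    "123619 371681.1 AA602964 AA609200",
    "",
    "118720 genbank_N73515 N73515",
    "101447 entrez_M21305 M21305",
]

lookup = build_lookup(rows)
print(lookup["123619"]["accessions"])
```

Keying on the Pkey mirrors how the B tables are meant to be used: a probeset that has no UnigeneID in the corresponding A table can be resolved to its cluster and member sequences.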
Table 5A shows 680 genes up-regulated in squamous cell carcinoma or adenocarcinoma lung tumors relative to normal lung and chronically diseased lung. These genes were 
selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average 
intensity (AI), a normalized value reflecting the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically 
diseased lung samples. 
R2: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples. 
R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples. 
R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile of AI for squamous cell carcinoma lung tumor samples. 
R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples minus the 15th percentile of AI for all normal lung, chronically 
diseased lung and tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples minus the 15th percentile of AI for all 
normal lung, chronically diseased lung and tumor samples 



UnigeneID Unigene Title 



100037 

100071 A28102 

100114 X02308 

100154 H60720 

100187 D17793 

100188 AW247090 
100202 BE294407 
100216 AA489908 
100269 NM_001949 
100287 AU076657 
100297 AU077258 
100330 AW410976 
100335 AW247529 
100360 W70171 
100372 NM_014791 
100474 NM_000699 
100486 T19006 
100491 D56165 
100516 D90278 
100522 X51501 



100629 AA015693 

100661 BE623001 

100677 AA353686 

100696 D14887 

100709 N26539 

100761 BE208491 

100830 AC004770 

100867 U14622 

100902 M15029 

100906 AU076916 

100960 J00124 

101045 J05614 

101061 NM_000175 

101071 L02840 

101124 L10343 

101175 U82671 

101181 

101204 L24203 

101210 L29301 

101216 AA284166 

101228 AA333387 

101233 AL135173 

101273 Z11933 

101342 U52112 

101346 AI738616 

101369 NM_000892 

101396 BE267931 

101431 BE185289 

101448 NM_000424 

101462 AL035668 

101466 BE252660 

101484 AA053486 

101502 M26958 

101505 AA307680 

101526 NM_002197 

101535 X57152 

101577 M34353 

101649 AW959908 

101663 NM_003528 

101664 AA436989 
101669 L24498 



Hs.82962 
Hs.81892 
Hs.78183 
Hs.57101 
Hs.99910 
Hs.1390 

Hs.1600 
Hs.182429 
Hs.77152 
Hs.6793 



Hs.11 

Hs.99949 
Hs.1640 
Hs.37058 
Hs.21291 
Hs.132748
Hs.57313
Hs.121686
Hs.100459
Hs.295112



Hs.180532 
Hs.84244 
Hs.112341



AFFX control: GAPDH 
AFFX control: GAPDH 
AFFX control: GAPDH 
Human GABAa receptor alpha-3 subunit 
thymidylate synthetase 
KIAA0101 gene product 
aldo-keto reductase family 1, member C3
x deficient (S. 



Hs.195850 
Hs.73853 
Hs.170197 
Hs.20315 

Hs.75692 
Hs.154721
Hs.99853 



Hs.2178 

Hs.121017 

Hs.80409 



proteasome (prosome, macropain) subunit, 
E2F transcription factor 3 
chaperonin containing TCP1, subunit 5 (e
protein disulfide isomerase-related prot
minichromosome maintenance deficient (S.
platelet-activating factor acetylhydrola
uridine monophosphate kinase
KIAA0175 gene product
amylase, alpha 2A; pancreatic 
RAN, member RAS oncogene family 



BE262621 Hs.73798 
Hs.82237 
Hs.2353 
Hs.84113 
Hs.82916 
Hs.878 
Hs.182505 
Hs.182018 
Hs.77348 
Hs.1901 



carcinoembryonic antigen-related cell ad 
prolactin-induced protein
n, type VII, alpha 1(- 



mitogen-activated protein kinase kin 
Homo sapiens ribosomal protein L39 mRNA,
zinc ribbon domain containing, 1 
general transcription factor IIA, 1 (37k 
myeloid/lymphoid or mixed-lineage leukem 
KIAA0618 gene product 
flap structure-specific endonuclease 1 
gb:Human transketolase-like protein gene 
ret proto-oncogene (multiple endocrine n 
guanine monphosphate synthetase 
keratin 14 (epidermolysis bullosa simple 
gb:Human proliferating cell nuclear and 
glucose phosphate isomerase 
potassium voltage-gated channel, Shab-re 
protease inhibitor 3, skin-derived (SKAL 
melanoma antigen, family A, 2 
macrophage migration inhibitory factor ( 



opioid receptor, n 
cyclin-dependent kinase inhibitor 3 (CDK 
chaperonin containing TCP1, subunit 6A (
sorbitol dehydrogenase 



keratin 5 (epidermolysis bullosa simplex 



gb:Human parathyroid hi 
asparagine synthetase 
aconitase 1, soluble



heparin-binding growth factor binding pr 
H2B histone family, member Q 
H2A histone family, member A 
growth arrest and DNA-damage-inducible,
101748 NM_001944

101759 M80244 

101771 NM_002432

101804 M86699 

101809 M86849 

101833 AU076442 



Hs.184601 
Hs.153837 
Hs.169840 
Hs.323733 



102002 NM_002484 Hs.81469

-i AL134223 Hs.306098 

102072 U09410 Hs.78743 
102083 T35901 Hs.75117 
L36196 Hs.81884 
102123 NM_001809 Hs.1594
Hs.75517 
Hs.313 
Hs.301613 
Hs.148495 
Hs.278554 
Hs.41706 
Hs.90073 
Hs.77254 
Hs.278657 
Hs.87539



102154 U17760 
102193 AL036335 
102217 AA829978 



102340 U37055 
U37519 
U39817 
102394 NM_003816
102404 NM_005429 
102537 U57094 
102581 AU077228 
102605 AI435128 
102610 U65011 

AW249285 
102642 AA205847 
102654 AV649989 

BE245169 

U71207 



Hs.36 
Hs.2442 



Hs.30743 
Hs.37110 
Hs.23016 
Hs.24385 



Hs.29279 
Hs.29287 
102687 NM_007019 Hs.93002
102696 BE540274 
102768 U82321 



Hs.239 



Hs.61796 



NM_006183

102888 AI346201 

102892 BE440042 
NM_002275 
BE561850 

102951 X15218 

102983 BE387202 

103023 AW500470 

103036 M13509 

103038 AA926960 

10X60 NMJW5940 

103099 AI693251

103119 X63629 

103168 X53463 

103185 NM_006825

M22440 

BE275607 

103242 X76342 

103316 X83301 

103375 NM_005982

103376 AL036166 
103385 NM_007069
103391 X94453 
103404 BE394784 
103430 BE564090 
103445 X98834 

103476 Y07701 

103477 AJ011812 

103478 BE514982 
103515 Y10275 

" BE616547 

103580 AA328046 



103841 AA314821 

103847 AF219946 

103913 AW957500 

104094 AA418187 



Hs.80342 
Hs.80506 
Hs.2969 
Hs.118638 
Hs.117950
Hs.83169 
Hs.334883 
I fs.15532^ 



Hs.74368 

Hs.170009 

Hs.1708 

Hs.389 

Hs.324728 

Hs.54416 

Hs.323378 

Hs.37189 

Hs.114366 

Hs.78596 

Hs.20716 

Hs.79971 



Hs.38178 
Hs.102237 
Hs.133543 
Hs.330515 



chymase 1, mast cell
bullous pemphigoid antigen 1 (230/240kD)
desmoglein 3 (pemphigus vulgaris antigen
solute carrier family 7 (cationic amino
myeloid cell nuclear differentiation ant
TTK protein kinase 

gap junction protein, beta 2, 26kD (conn 



midkine (neurite growth-promoting factor 
nucleotide binding protein 1 (E.coli Min
aldo-keto reductase family 1, member C1
zinc finger protein 131 (clone pHZ-10)
interleukin enhancer binding factor 2, 4
sulfotransferase family, cytosolic, 2A,
centromere protein A (17kD)
laminin, beta 3 (nicein (125kD), kalinin
secreted phosphoprotein 1 (osteopontin, 
JTV1 gene 

proteasome (prosome, macropain) 26S subu 
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
chromosome segregation 1 (yeast homolog)
chromobox homolog 1 (Drosophila HP1 beta
macrophage stimulating 1 (hepatocyte gro 
aldehyde dehydrogenase 3 family, member 
Bloom syndrome 

a disintegrin and metalloproteinase doma
vascular endothelial growth factor C 
RAB27A, member RAS oncogene family 
enhancer of zeste (Drosophila) homolog 2 
ubiquitin fusion degradation 1-like 
preferentially expressed antigen in mela 

G protein-coupled receptor 
Human hbc647 mRNA sequence 
CUG triplet repeat, RNA-binding protein 
eyes absent (Drosophila) homolog 2 
retinoblastoma-binding protein 8
ubiquitin carrier protein E2-C
forkhead box M1

gb:Homo sapiens clone 14.9B mRNA sequenc 
chaperonin containing TCP1, subunit 7 (e
transcription factor AP-2 gamma (activat
Homo sapiens cDNA: FLJ21930 fis, clone H
neurotensin 

ubiquitin carboxyl-terminal esterase L1
matrix metalloproteinase 3 (stromelysin
keratin 15 

small nuclear ribonucleoprotein polypept 
v-ski avian sarcoma viral oncogene homol
non-metastatic cells 1, protein (NM23A) 
multifunctional polypeptide similar to S 



NADH dehydrogenase (ubiquinone) Fe-S pro 
cadherin 3, type 1, P-cadherin (placenta
glutathione peroxidase 2 (gastrointestin 

le protein (63kD), endoplasmi 



chaperonin containing TCP1, subunit 3 (g
alcohol dehydrogenase 7 (class IV), mu o 
SMA5 

sine oculis homeobox (Drosophila) homolo 



translocase of inner mitochondrial membr 
sal (Drosophila)-like 2
aminopeptidase puromycin sensitive 
transcription factor NRF 
S100 calcium-binding protein A2 



polymerase (RNA) II (DNA directed) polyp
5T4 oncofetal trophoblast glycoprotein
" ng region Y)-box 2 



gb:Homo sapiens full length insert cDNA
hypothetical protein FLJ23468
tubby super-family protein 
104150 AU22044 Hs.331633

104257 BE550621 

104261 AW248364 

104331 AB040450 

104415 BE410992 

104558 R56678 



Hs.9222 
Hs.5409 
Hs.279862 



M360954 



104971 BE311926 

105011 BE091926 

105012 AF098158 
105026 M809485 
105076 AI598252 
105132 AA148164 
105143 AI368836 
105158 AW976357 
105175 AA305384 
105200 M328102 
105264 AA227934 
105298 BE387790 
105409 AW505076 
105460 AW296078 
105667 AA767526 
105743 BE246502 
105782 H09748 
105848 AW954064 
105891 U55984 
106019 AF221993 
106069 BE566623 
106073 AL157441 
106126 M576953 
106159 AK001301 
106220 D61329 
106260 AI097144 
106300 Y10043 
106307 AA436174 
106318 AA025610 
106341 AF191020 
106440 AA449563 
106481 D61594 
106586 AA243837 
106605 AW772298 
106654 AW0754B5 
106785 Y15227 
106813 C05766 
106895 AK001826 
106913 AI219346 
106919 AW043637 
107054 AI076459 
107059 BE614410 
107098 AI823593 
107104 AU076640 
107129 AC004770 
107198 AV657225 
107203 D20426 
107217 AL080235 
107284 NM_005629
107318 T74445 
107516 X57152 
107529 BE515065 
107728 AA019551 
107851 AA022953 
107901 L42612 
107922 BE153855 
107932 AW392555 
108015 AW298357
108056 AA043675 
108075 AI867370 
108187 BE245374
108296 N31256 
108305 AA071391 
108393 AA075211 
108480 AL133092 
108554 AA084948 
108573 AA086005 
108584 AA088325 
108597 AK000292 
108695 AB029000 

108699 AA121514 

108700 AA121518 
108780 AU076442 



Hs.83623 
Hs.27268 
Hs.14846 
Hs.292911 
Hs.155924 
Hs.7010 
Hs.15830 

Hs.9329 

Hs.124219 

Hs.37810 

Hs.247280 

Hs.24808 

Hs.234545 

Hs.25740 

Hs.24641 

Hs.26369 

Hs.301855 

Hs.271721 

Hs.22030 

Hs.9598 

Hs.57987 

Hs.24951 

Hs.289088 

Hs.46743 



Hs.5243 

Hs.151393 

Hs.17279

Hs.57787 

Hs.21103 

Hs.286049

Hs.20149 

Hs.181022 

Hs.25245 

Hs.86178 

Hs.21766 



Hs.15 



Hs.41639 
Hs.35861 
Hs.187958



Hs.120905
Hs.278732 
Hs.70823 
Hs.70832 
Hs.193540 
Hs.117938



hypothetical protein DKFZp566N034 
estrogen receptor binding site associate 
RNA polymerase I subunit 
cdk inhibitor p21 binding protein 
heme-regulated initiation factor 2-alpha 
hypothetical protein MGC4816
nuclear receptor subfamily 1, group I, m
Homo sapiens cDNA: FLJ21933 fis, clone H
Homo sapiens mRNA; cDNA DKFZp564D016 (fr
ESTs, Highly similar to 36071 2 band-6-pr
cAMP responsive element modulator
NPD002 protein
hypothetical protein FLJ12691
mitotic spindle coiled-coil related prot
chromosome 20 open reading frame 1
hypothetical protein FLJ12934
hypothetical protein MGC14833
HBV associated factor
ESTs, Weakly similar to I38022 hypotheti
hypothetical protein NUF2R
gb:zr57e08.s1 Soares_NhHMPu_S1 Homo sapi
hypothetical protein FLJ20287
DiGeorge syndrome critical region gene 8
Homo sapiens, clone IMAGE417S9B6, mRNA,
paired box gene 5 (B-cell lineage specif
sema domain, immunoglobulin domain (Ig),
B-cell CLL/lymphoma 11B (zinc finger pro
ESTs 

heat shock 90kD protein 1, alpha 
McKusick-Kaufman syndrome 
ESTs, Weakly similar to G02075 transcrip
downstream neighbor of SON
hypothetical protein FLJ13352
hypothetical protein FLJ10439
mitochondrial ribosomal protein L36
ESTs, Weakly similar to ALU1_HUMAN ALU S
high-mobility group (nonhistone chromoso
ESTs, Weakly similar to putative p150 [
cleavage and polyadenylation specific fa
hypothetical protein, estradiol-induced
glutamate-cysteine ligase, catalytic sub
tyrosylprotein sulfotransferase 1 
ESTs 

Homo sapiens mRNA; cDNA DKFZp564B076 (fr 



ESTs, Weakly similar to ALU5_HUMAN ALU S 
KIAA1272 protein 

RAD51 (S. cerevisiae) homolog (E coli Re
ESTs

nucleolar protein 1 (120kD)

flap structure-specific endonuclease 1

KIAA1040 protein 

programmed cell death 2 

DKFZP586E1621 protein 

solute carrier family 6 (neurotransmitte 

Homo sapiens clone 24416 mRNA sequence



Hs.335952 

Hs.18878 
Hs.49927 
Hs.62633 
Hs.139709 
Hs.27842 
Hs.161623 



Ig superfamily receptor LNIR
hypothetical protein FLJ21620
protein kinase NYD-SP15 



ESTs 

gb:zm61e06.r1 Sti _ 
gb:zm86a08.r1 Stratagene ovarian cancer 
hypothetical protein DKFZp434l0428 
gb:zn13b09,s1 Stratagene hNT neuron (937 
gb:zl84c04.s1 Stratagene colon (937204) 
Homo sapiens cDNA FLJ11448 fis, clone HE
hypothetical protein FLJ20285
KIAA1077 protein 
108810 AW295647

108816 M130884 

108857 AK001468 

108860 AA133334

108937 AL050107 

109010 NM_007240 Hs.44229

109121 BE389387 Hs.49767 

109166 M219691 

109227 M766998 

109415 U80736 

109418 AI866946 

109454 AA232255 



Hs.71331 
Hs.270501 
Hs.62180 
Hs.129911 
Hs.24341 



Hs.73625 
Hs.85874 
Hs.110326 
Hs.161707 



109543 AA564994 
109648 H17800 
109680 AB037734 



109998 AL042201 

"""" H11938 

110156 AA581322 

I AA907723 
AW450381 

110561 AA379597 

110854 BE612992 

110886 AW274992 

110916 BE178102 

11003 N52980 

1 AA837396 

11434 R01608 

11439 A1476429 

111540 U82670 

11597 R11499 

11895 T80581 

11929 AF027208 

112054 R43590 

112210 R49645 

112244 AB029000 

112382 R59904 

112392 R60763 

112442 M280174 

R70318 

112772 AI992283 

112869 BE261750 

112935 R71449 

1 AA694010 

112973 AB033023 

112992 AL157425 

113063 W15573 

113073 N39342 

113078 T40444 

113238 R45467 



Hs.26090 
Hs.21273 
Hs.21907 
Hs.4213 



114073 R44953 

114162 AF155661 

114208 AL049466 

114251 H15261 



114407 BE539976 

114560 AI452469 

114699 M127386 

114767 AI859865 

114793 AA158245 
A1417215 

115047 BE270930 

115060 AF052693 

115097 M256213 

115113 M256460 

115123 M256641 

115134 AW96B073 

115291 BE545072 

115347 M356792 

115414 M662240 

115522 BE614387 

115536 AK001468 

115566 AI142336 

115645 AI207410 

115648 AW016311 



hypothetical protein MGC5350 
ESTs, Moderately similar to ALU2_HUMAN
anillin (Drosophila Scraps homolog), act
ESTs

transcriptional co-activator with PDZ-bi
dual specificity phosphatase 12
NADH dehydrogenase (ubiquinone) Fe-S pro
RAB6 interacting, kinesin-like (rabkines
Human DNA sequence from clone RP11-16L21
trinucleotide repeat containing 9 
Hs.211556
Hs.222851 
Hs.7154 
Hs.4993 



109704 AI743880 Hs.12876 ESTs 



gb:yg61M3.s1 Soares infant brain 1NIB H 



Hs.142736 
Hs.19238 
Hs.9786 
Hs.189716
Hs.12723
Hs.112360



Hs.193274 
Hs.285681 
Hs.339730
Hs.35437 



Hs.22908 
Hs.22265 
Hs.7859 
Hs.21948 
Hs.22974 
Hs.27946 
Hs.22790 
Hs.103305 
Hs.165221

Hs.154443 



transcription factor NYD-sp10 
histone acetyltransferase 
hypothetical protein MGC16207 



HSPC150 protein similar to ubiquitin-con
hypothetical protein FLJ10607 similar to 
three-PDZ containing protein similar to 



Homo sapiens clone 25153 mRNA sequence 

prominin (mouse)-like 1

gb:yc85g02s1 Soares infant brain 1NIB H 

ESTs 

KIAA1077 protein 

gb:yh07g12.s1 Soares infant brain 1NIBH 
ESTs, Moderately similar to I57588 HSrel 



Hs.268760 ESTs 

Hs.6932 Homo sapiens clone 23809 mRNA sequence 

Hs.318127 hypothetical protein FLJ10201

Hs.133315 Homo sapiens mRNA; cDNA DKFZp761J1324 (f 

Hs.5027 ESTs, Weakly similar to A47582 B-cell gr 

Hs.103042 microtubule-associated protein 1B 

Hs.118354 CAT56 protein

Hs.189813 ESTs

Hs.200597 KIAA0563 gene product 

gb:ye53h05.s1 Soares fetal liver spleen

Hs.243010 Homo sapiens cDNA FLJ14445 fis, clone HE 



ESTs 
ESTs 

Homo sapiens mRNA; cDNA DKFZp434B0425 (f 
ESTs 

gb:zn90d09.r1 Stratagene lung carcinoma 
'(93720 



Hs.72010 ESTs 



gb:zo76c03,s1 Sti__„_ . 
hypothetical protein FLJ12577
chaperonin containing TCP1, subunit 6A (
gap junction protein, beta 5 (connexin 3 



gb:zrB1a04.s1 Soares_NhHMPu_S1 Homo sapi
ESTs, Highly similar to S02392 alpha-2-m
ESTs, Highly similar to A55713 inositol
hypothetical protein FLJ10461
hypothetical protein FLJ14825
AF15q14 protein 
c-Myc target JP01 

anillin (Drosophila Scraps homolog), act 
Human DNA sequence from clone RP11-196N1 



Hs.236894 
Hs.194331 
Hs.122579 



Hs.234478 Homo sapiens cDNA: FLJ22648 fis, clone H
115892 M291377

115906 AI767756 

115909 AWB72527 

115965 AA001732 



116127 AF126743 

116157 BE439838 

116190 AI949095 

116278 NM_003686

116335 AK001100 

116496 AW450694 

116503 AI925316 

116674 AI768015 

116929 AA586922 

116973 AI702054 

116993 AI417023 

117079 H92325 

117317 AI263517 

117326 N23629 

117396 W20128 

117412 N32536 

117519 N32528 

117693 AW179019 

117721 N46100 

117881 AF161470 

117903 M768283 

117992 AI015709 

118013 A1574126 

118017 AI813444 



118472 AL157545 

118709 AA232970 

119025 BE003760 

119027 AF086161 

119052 R10889 

119164 AF221993 

119186 AI979147 

119243 T12603 

119490 M195276 



Hs.63325 
Hs.70333 
Hs.287588 
Hs.50831 
Hs.82302 

Hsj 73233 
Hs.69517 
Hs.268115 
Hs.61232 
Hs.59982 
Hs.279884 
Hs.44298 
Hs.67776 
Hs.47504 
Hs.41690 
Hs.21433 
Hs.212617 
Hs.92127 
Hs.80475 
Hs.166982
Hs.40478 

Hs.43322 
Hs.241420 
Hs.296039 
Hs.42645 
Hs.146286 
Hs.112110 
Hs.93939 
Hs.260622 
Hs.47111 
Hs.172089 
Hs.94031 
Hs.42197 
Hs.42380 
Hs.166184 
Hs.48946 
Hs.48956 
Hs.42179 



hypothetical protein FLJ23468



119780 NM_016625

119845 W79123 

119941 M699485 

119994 AA642402 

120102 W67353 

120104 AK000123 



120715 AA292700 

120821 Y19062 

120859 AA826434 

120880 M360240 

120983 AA398209 

121034 AL389951 

121121 AA399371 

121313 AA402713 

121369 AW450737 

121376 M448103 

121476 AA412311 

121509 AA868939 

121563 AA412488 

121753 AK000552 

121838 AA425680 

121857 BE387162 

121991 AA430058 

122089 AW016543 

122105 AW241685 

122163 AA435702 

122318 AA429743 



122414 AI313473 



Homo sapiens cDNA FLJ14814 fis, clone NT
ESTs, Weakly similar AP1 II If .Nil I 
hypothetical protein FLJ10970 
cDNA for differentially expressed C016 g
ESTs, Weakly similar to T08599 probable 



ESTs, Weakly similar to T22341 hypothet



hypothetical protein DKFZp547J036 



gb:ys85(05.s1 Soares retina N2b4HR Homo
ESTs 

no sapiens mRNA for KIAA1 756 protein, 



butyrate-induced transcript 1
ESTs 

Homo sapiens mRNA; cDNA DKFZp586l2022 (f 



Hs.191381 
Hs.58561 
Hs.58896 
Hs.59142 
Hs.170218 
Hs.180479 
Hs.153881 
Hs.137569 
Hs.104463
Hs.97258 

Hs.96870 

Hs.1619 

Hs.97019 

Hs.97587 

Hs.271623 

Hs.189095 

Hs.97872 

Hs.128791

Hs.97903
Hs.97888 
Hs.48820 
Hs.323518 

Hs.230858
Hs.98649 
Hs.98682 
Hs.98599 
Hs.93829 

Hs.241551 
Hs.98998 
Hs.99087 



Homo sapiens mRNA; cDNA DKFZp434K0514 (f 
hypothetical protein FLJ11808 
gb:yf38d02.s1 Soares fetal liver spleen
McKusick-Kaufman syndrome
hypothetical protein FLJ22593
gb:CHR90123 Chromosome 9 exon II Homo sa
ESTs, Moderately similar to B34087 hypot 
ESTs 

gb:zc26d03.s1 Soares_senescent_fibroblas

hypothetical protein 

G protein-coupled receptor 87 



KIAA0251 protein 

hypothetical protein FLJ20116

Homo sapiens NY-REN-62 antigen mRNA, par 

tumor protein 63 kDa with strong homolog

ESTs 

ESTs, Moderately similar to S29539 ribos 
gb:zs59a06.s1NCLCGAP_GCBIHomo: 
staufen (Drosophila, RNA-binding protein 
achaete-scute complex (Drosophila) homol 



ESTs, Highly similar to A35661 DNA ex 
EST 

hypothetical protein FKSG32 



rs, Weakly similar to S47073 finger pr 
AF053305




budding uninhibited by benzimidazoles 1 




AA449352 








AI220089 


Hs.99439 




122852 


AI580056 








AW268962 










Hs.52620 


integrin, beta 8




AK001035 




B-cell CLL/lymphoma 11A (zinc finger pro




AA488687 




ESTs, Weakly similar to I38022 hypotheti




AA496369 




gb:zv37d10.s1 Soares ovary tumor NbHOT H 








small nuclear RNA activating complex, po 












AL035414 




hypothetical protein 




AW015887 


Hs.112574










hypothetical protein 




AA680003 


Hs.109363


omo sapiens cDNA: FL 60 r.c I 




BE550112




ESTs, Weakly similar to T2D3_HUMAN TRANS 




AI083986 


Hs.282977 


hypothetical protein FLJ13490








gb:ae62f01.s1 Stratagene lung carcinoma 




AA227714 


Hs.179703


KIAA0129 gene product 




AA621223 


Hs.112953






AI147155 


Hs.270016 






BE387335 


Hs.283713 






AF134160 


Hs.7327 






T96509 


Hs.248549 


ESTs, Moderately similar to S65657 alpha 






Hs.8858 






AU180215 


Hs.102301 


Homo sapiens mRNA; cUNA DKFZpoocJU^zo (f 




AW983221 




gb:EST375294 MAGE resequences, MAGH Homo 




AI360119.comp Hs.181013


phosphoglycerate mutase 1 (brain) 




BE550182 


Hs.127826






AK000483 


Hs.93872 


KIAA1682 protein




AI650360 


Hs.100256 






T58615 


Hs.110640 






AA693960 


Hs.103158 






W90022 


Hs.186809 






T32982 


Hs.102720 






AI057052 


Hs.133554


ESTs, Weakly similar to Z195_HUMAN ZINC




AA256743 


Hs.134158 


Homo sapiens, Similar to KIAA0092 gene p 




AA777690 


Hs.188501




AL162066 


Hs.54320 


hypothetical protein DKFZp762D096 




AI609449 


Hs.140197






BE219987 


Hs.166982 


phosphatidylinositol glycan, class F 




AA305800 


Hs.5672 


hypothetical protein AF140225




BE174587 


Hs.289721 


growth arrest specific transcript 5 




AI274906 


Hs.166835 


ESTs, Highly similar to 1814460A p53-ass 


125769 


BE270266 


Hs.82128 


5T4 oncofetal trophoblast glycoprotein 




AW836261 


Hs.337717 






W85858 


Hs.99804 




125875 


H14480 




gb:ym18b09.r1 Soares infant brain 1NIB H 


125924 


BE272506 


Hs.82109 


syndecan 1 




AI927475 


Hs.35406 


ESTs, Highly similar to unnamed protein 




H60340 




gb:yr39b04.r1 Soares fetal liver spleen 




M432266 


Hs.44648 






N49713 




gb:yv23!06.s1 Soares fetal liver spleen




AW614529 


Hs.285847 


CGI-19 protein 




AA283809 


Hs.184601 


solute carrier family 7 (cationic amino 


126522 


AI475110 
W31912 


Hs.203933 


gb:zc76d03.s1 Pancreatic Islet Homo sapi




AL035864 


Hs.59517 


cDNA for differentially expressed C016 g 




AA058394 


Hs.57887 






AA676910 






126627 


AA497044 


Hs.20887 


hypothetical protein FLJ10392




N49776 


Hs.170994 


hypothetical protein MGC10946 




AW976516 


Hs.283707 


Homo sapiens cDNA: FLJ21354 fis, clone C




AW975076 


Hs.172589


nuclear phosphoprotein similar to S. cer 




AW805510 


Hs.97056 




126892 


AF121856 


Hs.284291 


sorting nexin 6 


126928 


AA480902 


Hs.137401 






AA210954 






126986 


AI279892 


Hs.46801 






AI809521 




gb:wf30e03.x1 Soares_NFLJ_GBC_S1 Homos 




R25066 








AA347668 




gb:EST54026 Fetal heart II Homo sapiens 




AA830233 


Hs.293585 






AA305023 


Hs.81964 


SEC24 (S. cerevisiae) related gene famil




BE062109 


Hs.241551 


chloride channel, calcium activated, fam




AA315933 


Hs.120879 










Homo sapiens cDNA FU1l45ofis, clone HE


127444 


AW978474 


Hs.7560 


Homo sapiens mRNA for KIAA1729 protein, 


127500 


AW971353 


Hs.162115 


ESTs 


127524 


AI243596 


Hs.94830 


ESTs, Moderately similar to T03094 A-Wn 


127540 


N45572 


Hs.105362 


Homo sapiens, clone MGC:13257, mRNA, com


127599 


AA613204 


Hs.150399 


ESTs 


127609 




Hs.530 


collagen, type IV, alpha 3 (Goodpasture 


127662 


W80755 


Hs.8294 


KIAA0196 gene product 


127668 


AI343257 


Hs.139993 


ESTs 
Hs.120189 ESTs



128015 221169 

128027 AI433721 

128077 AI310330 

128166 NM_006147

128226 AI284940 



H12912 
128572 M933022 
128777 AI878918 
128781 N71826 
128796 AJ000152 



128971 H05132 

129008 AL079648 

129041 BE382756 

129075 BE250162 

129105 AI769160 

129189 AB023179 

129229 AF013758 

129241 AI878857 

129300 W94197 

129404 AI267700 

129457 X61959 

129466 L42583 

129494 AI148976 

129605 AF061812 

129641 AI911527 

129665 AW163331 

129703 BE388665 

129720 AA156214 

129748 M16707 

129890 AI868872 

129896 BE295568 

129945 BE514376 

130010 AA301116 

130026 T40480 

130080 X14850 

130149 AW067805 

130285 AA063546 

130441 U63630 

130482 AW409701 

130500 AB007913 

130524 U89995 

130541 X05608 

130553 AF062649 

130567 AA383092 

130577 M69241 

130627 BE003054 

130648 AI458165 

130697 L29472 

130744 H59696 

130800 AI187292 

130867 NM_001072

130869 J03626 

130925 AF093419 

130994 W17044 

131028 AI879165 

131031 NM_001650

131041 T15767 

131058 W28545 

131090 AI143139 

131112 H15302 

131148 AW953575 

131185 BE280074 

131200 BE540516 

131219 W25005 

131257 AW339037 

131375 AW293155 

131460 NM_003729

131476 AI521663 

131510 BE245374 

131646 BE302464 

131786 BE000971 

131839 AB014533 

131843 AA192315 



Hs.334659 
Hs.164153 
Hs.128720 

Hs.289082 
Hs.279009 
Hs.185030 
Hs.101047 
Hs.258618 
Hs.274691 
Hs.256583 
Hs.10526 
Hs.105465 
Hs.105924 
Hs.166468 



Hs.303125 
Hs.23960 
Hs.293732 
Hs.24395 
Hs.24908 
Hs.143134 
Hs.27076 
Hs.334644 
Hs.27842 
Hs.30057 
Hs.306083 
Hs.33010 
Hs.184062 



Homo sapiens cDNA: FLJ23123 fis, clone L
phosphatidic acid phosphatase type 2A
Homo sapiens cDNA FLJ14576 fis, clone Ml 
hypothetical protein MGC14139 



interferon regulatory factor 6 
GM2 ganglioside activator protein 
matrix Gla protein 
ESTs 

transcription factor 3 (E2A immunoglobul



Hs.317584 

Hs.334309
Hs.112062
Hs.115947
Hs.11805 
Hs.118778 
Hs.179999 
Hs.12152 
Hs.123053 



Hs.165998 

Hs.142838

Hs.332112

Hs.147097 

Hs.172665 

Hs.75981 

Hs.155637 

Hs.1578

Hs.158291 

Hs.159234 

Hs.211584 

Hs.252587 



Hs.17296 

Hs.1802 

Hs.18747 

Hs.19574 

Hs.284239 

Hs.2057 

Hs.169378

Hs.327337 

Hs.2227



ESTs 
ESTs 

solute carrier family 2 (facilitated glu 
dihydrofolate reductase 
Homo sapiens brain tumor associated prot 
KIAA0962 protein 

polyadenylate binding protein-interactin
hematological and neurological expressed 
ribosomal protein L26 homolog 
ESTs 

aspartylglucosaminidase 



keratin 16 (focal non-epidermolytic palm 
ESTs 

KDEL (Lys-Asp-Glu-Leu) endoplasmic retic
Homo sapiens, clone IMAGE:3457003, mRNA 
APMCF1 protein 
H4 histone, family 2 
hypothetical protein FLJ22704
UDP-Gal:betaGlcNAc beta 1,4- galactosylt
PAI-1 mRNA-binding protein 
nucleolar phosphoprotein Nopp34 
EST 

H2A histone family, member X

ubiquitin specific protease 14 (tRNA-gua
protein kinase, DNA-activated, catalytic
baculoviral IAP repeat-containing 5 (sur
KIAA0444 protein 

forkhead box E1 (thyroid transcription f
neurofilament, light polypeptide (68kD) 
pituitary tumor-transforming 1 
replication protein A3 (14kD) 
insulin-like growth factor binding prote
matrix metalloproteinase 12 (macrophage 
hypothetical protein MGC2376 



POP7 (processing of precursor, S. cerevi 

hypothetical protein MGC5469 

UDP glycosyltransferase 1 family, polype 

uridine monophosphate synthetase (orotat 

multiple PDZ domain protein

ESTs 

CCAAT/enhancer binding protein (C/EBP), 



RNA 3'-terminal phosphate cyclase 
hypothetical protein FLJ14668 
hypothetical protein FLJ11210 
MRS2 (S. cerevisiae)-like, magnesium hom
topoisomerase (DNA) II alpha (170kD)
131965

132000 AW247017 

132040 NM_001196

132109 AW190902 

132114 NM_006152

132162 AA315805 



Hs.35962 
Hs.36978 
Hs.315689 
Hs.40098 



AI752235 Hs.41270 

132180 NM_004460 Hs.418

132181 AW961231 Hs.16773

132182 NM_014210 Hs.70499
132231 AA662910 Hs.42635 
132277 AK001745 Hs.184628 
132328 NM_014787 Hs.44896 
132394 AK001680 Hs.30488 
132424 AA417878 Hs.48401 
132528 T78736 Hs.50758 
132543 BE568452 Hs.5101 

L19778 Hs.51011 

132550 AW969253 Hs.170195 

132552 BE621985 



lymphoid-restricted membrane protein 
desmoglein 2 

procollagen-lysine, 2-oxoglutarate 5-dio 
fibroblast activation protein, alpha 
Homo sapiens clone TCCCIA00427 mRNA seq
ecotropic viral integration site 2A
hypothetical protein DKFZp434K2435
hypothetical protein FLJ10883



protein regulator of cytokinesis 1 



H2Ahis 



•e family, m 



132581 

132617 AF037335 

132638 AI796870 

132653 Z15008 

132669 W38586 

132710 W74001 

132771 Y10275 

132799 W73311 

132833 U78525 



132959 AW014195 

132962 AA576635 

132990 X77343 

132994 AA112748 

AL042444 

133050 X73424 



bone morphogenetic protein 7 (osteogenic
thiopurine S-methyltransferase
hypothetical protein FLJ20624 
carbonic anhydrase XII 
DNA segment on chromosome X (unique) 992 
laminin, gamma 2 (nicein (100kD), kalini 
guanine nucleotide binding protein (G pr 
serine (or cysteine) proteinase inhibito 



Hs.52256 
Hs.5338 
Hs.54277 
Hs.54451 
Hs.293981 
Hs.55279 
Hs.56407 
Hs.169407 
Hs.57783 
Hs.9973 
Hs.234896 

Hsi6153 
Hs.334334 
Hs.279905 
Hs.62402 
Hs.63788 
Hs.6456 
Hs.139800 
Hs.65648 
Hs.662 
Hs.66744 
Hs.254105 
Hs.73112 
Hs.7327 
Hs.73818 
Hs.73826 
Hs.74316 
Hs.74346 

133615 M62843 Hs.75236 
133627 NM_002047 Hs.75280
133649 U25849 
133669 NM_006925 
133749 L20852 
133776 BE268649 
AB011155 
133946 AJ001258 
133973 N55540 
134047 BE262529 
134098 BE513171 

134107 NM_005629 Hs.187958 solute carrier family 6 (neurotransmitte



5.1 73878 



ESTs, Weakly similar to YAE6_YEAST HYPO 
CGI-48 protein 

transcription factor AP-2 alpha (activat 
clone HQ0310PRO0310p1 
p21/Cdc42/Rac1-activated kinase 1 (yeast
propionyl Coenzyme A carboxylase, beta p
chaperonin containing TCP1, subunit 2 (b
high-mobility group (nonhistone chromoso
RNA binding motif protein 8A

twist (Drosophila) homolog (acrocephalos
enolase 1, (alpha)

guanine nucleotide binding protein (G pr

ubiquinol-cytochrome c reductase hinge p
protein tyrosine phosphatase, non-recept
desmoplakin (DPI, DPII)
hypothetical protein MGC14353
ELAV (embryonic lethal, abnormal vision,
glycyl-tRNA synthetase
acid phosphatase 1, soluble
splicing factor, arginine/serine-rich 5
solute carrier family 20 (phosphate tran
ADP-ribosyltransferase (NAD+; poly (ADP-
discs, large (Drosophila) homolog 5
NIPSNAP, C. elegans, homolog 1
ESTs, Weakly similar to similar to ankyr 
phosphoglycerate kinase 1 
:in L3 



134112 

134158 U15174 

134160 T98152 

134168 AA398908 

134185 AA285136 

134201 L35035 

134272 X76040 

134276 BE083936 

134353 AL138201 

134367 AA339449 

134380 AU077143 

134423 H53497 

134469 AA279661 

134470 X54942 
134498 AW246273 
134502 BE148534 
134510 NM_002757
134548 N95406 
134654 AK001741 



Hs.79150 
Hs.79428 
Hs.79432 
Hs.181634 
Hs.301914 
Hs.79886 
Hs.278614 
Hs.80976 
Hs.82120 
Hs.82285 
Hs.179565 
Hs.83006 
Hs.83753 
Hs.83758 
Hs.84131 
Hs.84168 
Hs.250870 
Hs.333495 
Hs.8739 



ngTCP1,si 
BCL2/adenovirus E1B 19kD-interacting pro 
fibrillin 2 (congenital contractural ara 
Homo sapiens cDNA: FLJ23602 fis, clone L
neuronal specific transcription factor D
ribose 5-phosphate isomerase A (ribose 5 
protease, serine, 15 

antigen identified by monoclonal antibod 
nuclear receptor subfamily 4, group A, m
phosphoribosylglycinamide formyltransfer
minichromosome maintenance deficient (S.
CGI-139 protein

small nuclear ribonucleoprotein polypept
CDC28 protein kinase 2 
threonyl-tRNA synthetase 
UV-B repressed sequence, HUR7 
134724 AF045239 Hs.321576

134743 AA044163 Hs.89463

134781 AA374372 

134806 AD001528 

134853 BE2683.' 

134859 D26488 

134891 R51083 

134960 BE246400 

134993 BE409809 

135047 AL134197

135080 AI761180 Hs.94211

135103 NM_003428 Hs.9450

135145 AW014729 Hs.95265

135184 U13222 Hs.96028

135242 AI583187

135286 AW023432






Hs.90787 
Hs.285176 
Hs.301005 
Hs.93597 



135355 AK001652 



Hs.9700 
Hs.97849 
Hs.9788 
Hs.99423 



135371 NM_005025 Hs.997



acetyl-Coenzyme A transporter 
purine-rich element binding protein B 
cyclin-dependent kinase 5, regulatory su 
rcd1 (required for cell differentiation,
zinc finger protein 84 (HPF2) 
nuclear factor related to kappa B bindin 
forkhead box D1
cyclin E1 



jnt component 4-binding protein, 



TABLE 5B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 5A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number
Accession: Genbank accession numbers 
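As an illustration of the three-column layout just described, the following minimal sketch parses one Table 5B row into its Pkey, CAT number, and list of Genbank accessions, and builds a lookup from accession to Pkey. It assumes whitespace-separated fields in that order; the function names and the example row are hypothetical and are not taken from the table.

```python
from collections import namedtuple

# One row of a Pkey / CAT number / Accession table (field names are illustrative).
ClusterRow = namedtuple("ClusterRow", ["pkey", "cat_number", "accessions"])


def parse_cluster_row(line):
    """Split a whitespace-separated row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected Pkey, CAT number and at least one accession")
    return ClusterRow(fields[0], fields[1], tuple(fields[2:]))


def index_by_accession(rows):
    """Map every accession to the Pkey of the cluster row that lists it."""
    index = {}
    for row in rows:
        for accession in row.accessions:
            index[accession] = row.pkey
    return index
```

For a hypothetical row such as `"900001 12345_1 X00001 X00002"`, `parse_cluster_row` yields Pkey `900001`, CAT number `12345_1`, and two accessions. Rows in which a field is missing would need separate handling.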






114793 
108305 
108393 



1703458_1
244301_1
1774795_1
!56( 133 J 



292419_1
135322_1
150742_1
111550_1



115113 
101045 
108554 



AW963221 AA344870 AA344871 H93331 
M26958 
R49625 F10674 
H60340 N91637
U82321 H66077
N49713 N49819 W03810
R25066 R20144 R20145 Z43845
AA347668 AW956810 Z44271 F07065 F07064 R13506
T12603 T12604
H14480 N98295
R43590 F10439
AA210954 AA211007
AI809521 H12174 Z42556
AA429743 AA442754
AA127386 R15644 AA127404
AA158245 AA158235
AA071391 AA069892 AA069891
AA075211 AA075245 AA075126 AA074946
U14622 



113411_1
tigr_HT4586
genbank_AA609839
genbank_F09609 F09609
genbank_AA292700 AA292700
genbank_T97307 T97307
genbank_AA256460 AA256460
entrez_J05614 J05614
genbank_AA084948 AA084948
genbank_AA086005 AA086005
149538_1 R10889 R10888
126522 416020_1 W31912 AI167491

439280_1 AA676910 AA778853 AA778865 W86800

46922_1 W42667 AI580740 AI890440 AI561350 AW467906 AW151450 AI325927 AL041716 AI885600 AI742213 AW248624 AI955498 AA033947 

AA845593 AI623711 N68583 C00064 AA193567 AW083868 AW163216 AA191595 AA522778 AI528008 AI915518 AA843508 AI926195 
AA176265 AW167963 AA992115 W93547 AW103572 AI862994 AI342059 AA911719 AA176155 AA024712 AA069988 AA205591 AI591107 
AI199673 AI811765 AJ275832 AI422233 AI191852 AI096682 AI580124 AI683612 AA582453 AA927559 AA486415 T32414 AI084978 H44849 
H44848 H20477 T91695 W47039 AA070055 AA024795 AA325855 AA379245 AA379330 AA385580 W25920 W03688 AA448359 AA093881 
AW362477 AA089997 AI350265 W93479 N99688 AA932257 AW351469 H63590 AA663402 AA069771 AW087986 AI858420 AA600214 
AI970774 AI857712 AI683081 AI885584 AW131150 AI557981 AW002714 AW189973 AW075495 AW168303 AA953714 AW516881 AI357375 
AI566663 AW512676 AI570580 AI023690 AA448216 AI079853 AI422707 AA779516 AW026972 AW130082 AW162307 AW438646 AA709332 
AW192394 AI167350 AI217879 AI129152 AA719509 AI350480 AA663418 AI003634 AW118546 AA180261 AA442833 AI268625 AA888881 
AI038759 AA846723 AI248770 AA993694 AI280335 AI885107 AW518649 AA641563 AA995835 AA582521 AI276744 AA436478 AI017360 
AI620763 AI859887 N73926 AI076327 AI741615 AI160617 AW172819 AI492005 AA677429 AA996334 AI693771 AI950039 AI245629 AI288515 
AI866186 T93293 AA173252 AA599779 AI630092 AW439316 AI084555 AI272672 AI583507 AW473219 AA738132 AW473283 AI367492 
AA995410 AI689624 AA206353 AI033095 AI040382 AA873630 AI221074 AI934840 AI418680 AA844306 R94503 AA773520 AA843169 
AA219425 AA629658 AI811719 AW411275 AI590981 W37907 AI591178 AI684051 AA983238 AA669347 AA976239 AA704570 AI628339 
AI884391 AI241580 AI003539 AW176687 AA009650 N34565 AI333493 AI186070 AA070827 AA411683 AI280884 AA872023 AA207255 
AA021576 N71953 AI885888 AW076039 T15777 AI537673 AW248048 H09554 W93480 W47001 AW079114 AA063160 AA757453 R60788 
AI859431 H20478 AA218882 AA757465 AA100995 AI864135 AI934209 AA070603 H47008 AA219646 W61039 W93907 AW385050 W37967 
W78028 AA189007 AA479135 R93650 AA442312 T30287 AA847628 AA180262 AA009649 C03892 AW149464 AA310963 AA219693 
AA069747 R29207 AA094784 AA293815 AA447848 AI984167 N90393 C05097 N56499 AW292351 AW149681 AW473258 AA529322 AI004409 
AW105577 AI954937 AI811070 AA902422 AW514437 AA535460 AA916877 AW517122 AA974657 AA975649 AW517130 AW517129 F31737 
W07688 AA193645 AA378994 AA489273 F32267 W39303 AA021181 N35810 AA406524 AA062553 AA436801 H08985 H15979 N40310 






WO 02/086443 PCT/US02/12476 

AA436789 AA232172 AW360778 W25862 R60282 AA436530 AA378894 AA187461 AI940535 AA604210 AA089514 AA360421 N88243 N84281 
AA209340 N56174 N88374 AA191088 AW247691 AA249013 AA093111 AA972536 AW298594 AA375893 T12139 W28186 AW243849 
AI288629 AA843996 W15260 AI188286 AW248079 R15836 

119599 genbank_W45552 W45552 

112382 genbank_R59904 R59904 

105254 genbank_AA227934 AA227934 

100071 entrez_A28102 A28102 

123315 714071_1 AA496369 AA496646 



Table 6A shows 99 genes up-regulated in nonsmokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 



ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: average of AI for samples from non-smokers with adenocarcinoma divided by the 90th percentile of AI for samples from smokers with adenocarcinoma 

R2: average of AI for samples from non-smokers with squamous cell carcinoma divided by the 90th percentile of AI for samples from smokers with squamous cell 

BE379727 
AA586894 
AI076795 
AB002367 
AW373062 Hs.83623 
AA055829 ESTs, Weakly similar to ALU1_HUMAN ALU 
i is. 3^50 
AB018549 
BE219231 
Hs.32085 
AW274992 

111057 T79639 Hs.14629 ESTs 16.50 
111950 AF071594 Hs.110457 Wolf-Hirschhorn syndrome candidate 1 11.00 
112291 R53972 Hs.26026 ESTs 
112956 Z43784 Hs.75893 ankyrin 3, node of Ranvier (ankyrin G) 
113009 T23699 Hs.7246 ESTs 
113060 BE564162 Hs.250820 hypothetical protein FLJ14827 9.79 
113073 N39342 Hs.103042 microtubule-associated protein 1B 32.50 
113074 AK001335 Hs.31137 protein tyrosine phosphatase, receptor t 
113121 T48011 Hs.8764 EST 
113125 AA968672 Hs.8929 hypothetical protein FLJ11362 19.50 
113757 AA703095 Hs.18631 ESTs 
113848 W52854 Hs.27099 hypothetical protein FLJ23293 similar to 6.00 
113884 Hs.28529 chromosome 12 open reading frame 2 
113936 W17056 Hs.83623 nuclear receptor subfamily 1, group I, m 
114875 AA235609 Hs.236443 Homo sapiens mRNA; cDNA DKFZp554N1053 ( 
114987 AA251016 Hs.87808 EST 
115460 AW958439 Hs.38613 ESTs 
115722 W91892 Hs.59609 ESTs 
116261 AA481788 Hs.190150 ESTs 9.50 
116830 H61037 Hs.70404 ESTs, Weakly similar to ALU2_HUMAN ALU 8.50 
116970 AB023179 Hs.9059 KIAA0962 protein 7.50 
117178 H98675 Hs.269034 ESTs 
117757 AF088019 Hs.46732 EST 7.50 
118283 AA287747 Hs.173012 ESTs, Weakly similar to A46010 X-linked 16.50 
118384 AF217525 Hs.49002 Down syndrome cell adhesion molecule 
118657 AI822106 Hs.49902 ESTs 
120328 AA923278 Hs.290905 ESTs, Weakly similar to protease (H.sapi 
120404 AB023230 Hs.96427 KIAA1013 protein 7.00 
120524 AA261852 Hs.192905 ESTs 6.00 
120688 AW207555 Hs.97093 Homo sapiens cDNA: FLJ23004 fis, clone L 17.92 
121558 AA412497 

121676 H56037 Hs.108146 

121936 AI024600 Hs.98612 

121938 AA428659 Hs.98610 

122177 AA435789 Hs.98833 

123442 AA299652 Hs.111496 

123551 AA608837 

123756 AA609971 Hs.112795 

123861 AA620840 

124371 N24924 Hs.188601 

127477 BE328720 Hs.280651 

127591 AI190540 Hs.131092 

128252 AA455924 Hs.192228 

128426 AI265784 Hs.145197 

128925 R67419 Hs.21851 

128945 AI990506 Hs.8077 

129105 AI769160 Hs.108681 

129235 AW977238 Hs.126084 

129506 AB020684 Hs.11217 

129595 U09550 Hs.1154 

130160 AA305688 Hs.267695 

130340 D82326 Hs.239106 

131220 AB023194 Hs.300855 

131430 AI879148 Hs.26770 

132114 NM_006152 Hs.40202 

132458 AA935315 Hs.48965 

132647 NM_006927 Hs.54432 

132655 D49372 Hs.54460 

132682 AI077500 Hs.54900 

132747 AA345241 Hs.55950 

132812 R50333 Hs.92186 

133337 AF085983 Hs.293676 

133876 AL134906 Hs.771 

134119 AW157837 Hs.79226 

134464 AA302983 Hs.239720 

134542 M14156 Hs.85112 

135002 AA448542 Hs.251677 

135305 AA203555 Hs.98288 



gb:zt95g12.s1 Soares_testis_NHT Homo sap 



gb:af89g01.s1 Soares_testis_NHT Homo sap 



ESTs 
ESTs 

Homo sapiens cDNA FLJ12900 fis, clone NT 
Homo sapiens mRNA; cDNA DKFZp547E184 (fr 
Homo sapiens brain tumor associated prot 
KIAA1055 protein 
KIAA0877 protein 






solute carrier family 3 (cystine, dibasi 
KIAA0977 protein 
i binding pro 



fatty acid bin 



Homo sapiens cDNA: FLJ21693 fis, clone C 
sialyltransferase 4B (beta-galactosidase 
small inducible cytokine subfamily A (Cy 
serologically defined colon cancer antig 
ESTs, Weakly similar to KIAA1330 protein 
Leman coiled-coil protein 
ESTs 

phosphorylase, glycogen; liver (Hers dis 
fasciculation and elongation protein zet 
CCR4-NOT transcription complex, subunit 
insulin-like growth factor 1 (somatomedi 
G antigen 7B 

Homo sapiens cDNA FLJ14903 fis, clone PL 



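TABLE 6B shows the accession numbers for those primekeys lacking unigeneID's for Table 6A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 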



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accessions 

108562 36375_1 AA100796 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274 

103439 35330_1 X98266 N41124 

123551 genbank_AA608837 AA608837 

123861 genbank_AA620840 AA620840 

102832 entrez_U92015 U92015 

101972 entrez_S82472 S82472 

121558 genbank_AA412497 AA412497 
Table 7A shows 98 genes down-regulated in nonsmokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 



Pkey ExAccn 

100187 D17793 

100380 D82343 

100576 X00356 

100971 BE379727 

101046 K01160 

101066 AW970254 

101175 U82671 
W05150 

101663 NM_003528 

101677 NM_000715 

101745 M88700 

101941 S77583 

102125 NM_006456 

102242 U27185 

102340 U37055 

U39840 

NM_001394 

102669 U71207 

102796 AL079646 

102829 NM_006183 

103207 X72790 

103242 X76342 

103260 X78416 
X89211 

104212 AB002298 

104252 AF002246 

104258 AF007216 

105024 AA126311 

106260 AI097144 

106440 AA449563 

106566 BE298210 

106605 AW772298 

106614 AA648459 

106654 AW075485 

106999 H93281 

108700 AA121518 

108810 AW295647 

108857 AK001468 



111722 R23924 
T03927 
AL157425 
113073 N39342 
114251 H15261 
115230 AA278300 
115291 BE545072 
115815 AW905328 
AW872527 
115955 AA001732 
116107 AL133916 



118466 N66741 

120484 AA253170 

120983 AA398209 

121034 AL389951 

121423 AW973352 

AA451884 

122946 AI718702 
AA487200 

124472 N52517 

124526 N62098 
H49193 

125731 R61771 

125747 NM_002884 

H79863 

126547 U47732 

128966 R38438 



UnigeneID Unigene Title 



Hs.36980 
Hs.37034 
Hs.2178 
Hs.1012 
Hs.150403 

Hs.288215 

Hs.82547 

Hs.278657 

Hs.299867 

Hs.2359 

Hs.29279 

Hs.107019 

Hs.80962 



Hs.173035 
Hs.210863 
Hs.5462 



Hs.21103 
Hs.335951 
Hs.286049 
Hs.10710 
Hs.193540 
Hs.71331 
Hs.62180 
Hs.293780 
Hs.12860 
Hs.12876 
Hs.28419 
Hs.23596 
Hs.293147 

Hs.103042 
Hs.21948 
Hs.124292 
Hs.122579 
Hs.180842 

Hs.173233 
Hs.172572 
Hs.164649 



aldo-keto reductase family 1, member C3 
neuroblastoma (nerve tissue) protein 
calcitonin/calcitonin-related polypeptid 
fatty acid binding protein 4, adipocyte 
(NONE) 

Charcot-Leyden crystal protein 
melanoma antigen, family A, 2 
homeo box A5 

H2B histone family, member Q 
complement component 4-binding protein, 
dopa decarboxylase (aromatic L-amino aci 
gb:HERVK10/HUMMTV reverse transcriptase 



102.40 
463.80 
672.00 



retinoic acid receptor responder (tazaro 
macrophage stimulating 1 (hepatocyte gro 
hepatocyte nuclear factor 3, alpha 
dual specificity phosphatase 4 
eyes absent (Drosophila) homolog 2 
symplekin; Huntingtin interacting protei 



gb:H.sapiens DNA for endogenous retrovir 
KIAA0300 protein 

cell adhesion molecule with homology to 
solute carrier family 4, sodium bicarbon 
ESTs 

ESTs, Weakly similar to ALU1_HUMAN ALU S 
glutamate-cysteine ligase, catalytic sub 
gb:601118016F1 NIH_MGC_17 Homo sapiens c 
Homo sapiens mRNA; cDNA DKFZp564B075 (fr 
hypothetical protein AF301222 



Hs.96473 
Hs.97587 
Hs.271623 
Hs.290585 
Hs.190121 
Hs.308026 

Hs.102670 

Hs.293185 

Hs.124984 

Hs.26912 

Hs.865 

Hs.114243 

Hs.84072 

Hs.182575 



hypothetical protein FLJ20417 

ESTs, Moderately similar to 2109260A B c 

hypothetical protein MGC5350 

anillin (Drosophila Scraps homolog), act 



ESTs, Moderately similar to A46010 X-li 
Homo sapiens mRNA; cDNA DKFZp751J1324 (f 
microtubule-associated protein 1B 
ESTs 

Homo sapiens cDNA FLJ23123 fis, clone 
hypothetical protein FLJ10461 
ribosomal protein L13 

ESTs, Weakly similar to DAP1_HUMAN DEATH 
hypothetical protein FLJ10970 
hypothetical protein FLJ20093 
hypothetical protein DKFZp434H247 
gb:HUMGS02848 Human adult lung 3' direct 
gb:yz33g08.s1 Morton Fetal Cochlea Homo 
EST 
EST 



RAP1A, member of RAS oncogene family 



solute carrier family 15 (H+/peptide tra 
128233 AW889132 
128420 AA650274 
AW160432 
129014 AW935187 



130385 AW067800 
130732 AW890487 
131025 AB040900 



Hs.192013 
Hs.150271 
Hs.180138 
Hs.124511 
Hs.11916 
Hs.41296 



Hs.63984 
Hs.6189 
Hs.24654 
Hs.31921 
Hs.42676 



132856 NM_001448 Hs.58367 
132977 AA093322 Hs.301404 



Hs.7645 
Hs.8087 
Hs.80876 



135047 AL134197 Hs.93597 
135056 N75765 Hs.93765 
135309 AI564123 Hs.42500 



ESTs 
ESTs 
ribokinase 

fibronectin leucine rich transmembrane p 
craniofacial development protein 1 
KIAA1357 protein 
KIAA1497 protein 
zinc finger protein 36 (KOX18) 



Homo sapiens cDNA FLJ11640 fis, clone HE 76.20 



glypican 4 

RNA binding motif protein 3 
solute carrier family 20 (phosphate tran 
fibrinogen, B beta polypeptide 
NAG-5 protein 

flavin containing monooxygenase 3 
TATA box binding protein (TBP)-associate 
lysosomal-associated membrane protein 2 
cyclin-dependent kinase 5, regulatory su 
lipoma HMGIC fusion partner 
ADP-ribosylation factor-like 5 



TABLE 7B shows the accession numbers for those primekeys lacking unigeneID's for Table 7A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 



Pkey: Unique Eos probeset identifier number 




X72790 

BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 EE327124 R14963 AA085210 AW274273 AI333584 AI369742 AI039558 
AI885095 AI476470 AI287650 AI885299 AI985381 AW592624 AW34C13B AI255556 AA456390 AI310815 AA484951 
genbank_D45652 D45652 
genbank_N66741 N66741 
entrez_K01160 K01160 
entrez_S77583 S77583 
entrez_X89211 X89211 
genbank_AA487200 AA487200 
Table 8A shows 1720 genes either up or down-regulated in lung tumors or chronically diseased lung relative to a broad collection of over 40 distinct normal body tissues. 
Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 39494 
probesets on the Eos/Affymetrix Hu02 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a 
normalized value reflecting the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: 70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung 

R2: 70th percentile of AI for chronically diseased lung divided by 90th percentile of AI for normal lung 
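
By way of illustration only, ratios of the kind defined for R1 and R2 may be computed as in the following minimal sketch. The linear-interpolation percentile method, the function names `percentile` and `expression_ratio`, and the sample intensity values are assumptions made for this example; the specification does not state which percentile convention was used.

```python
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile (0-100) of values.

    Assumption: linear interpolation between the two nearest ranks.
    """
    if not values:
        raise ValueError("percentile() requires at least one value")
    if not 0.0 <= pct <= 100.0:
        raise ValueError("pct must lie between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def expression_ratio(
    disease_ai: Sequence[float],
    normal_ai: Sequence[float],
    disease_pct: float = 70.0,
    normal_pct: float = 90.0,
) -> float:
    """Ratio of a percentile of disease-sample AI to a percentile of normal-sample AI."""
    denominator = percentile(normal_ai, normal_pct)
    if denominator == 0:
        raise ValueError("normal-sample percentile is zero; ratio undefined")
    return percentile(disease_ai, disease_pct) / denominator


if __name__ == "__main__":
    # Hypothetical average intensity (AI) values for a single probeset.
    tumor_ai = [120.0, 340.0, 410.0, 95.0, 280.0]
    diseased_ai = [150.0, 170.0, 90.0, 210.0]
    normal_lung_ai = [40.0, 55.0, 38.0, 61.0, 47.0]

    r1 = expression_ratio(tumor_ai, normal_lung_ai)
    r2 = expression_ratio(diseased_ai, normal_lung_ai)
    print(f"R1 = {r1:.2f}, R2 = {r2:.2f}")
```

Dividing by a high percentile of the normal samples rather than by their mean is a conservative choice: a ratio above 1 then indicates that the upper portion of the diseased samples exceeds nearly all of the normal samples.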

AW189787 
AI469095 
AW135830 
AW118822 
hypothetical protein FLJ23393 
AI623332 
AA235361 Hs.96840 KIAA1527 protein 
Hs.220615 
AW449802 Hs.285901 
AI890356 
AI041019 
AW204069 Hs.312716 ESTs, Weakly similar to unnamed protein 
AA593373 
AA565209 Hs.269439 
AW450840 Hs.148590 
AI927208 Hs.208952 
AA677570 Hs.185918 
AA758115 Hs.128350 
AA843986 Hs.190586 
AI819198 Hs.208229 
AW450466 Hs.126830 
Hs.159955 
AI678034 Hs.131099 
AI733621 Hs.133011 zinc finger protein 117 (HPF9) 
AI077462 

Pkey    ExAccn     UnigeneID  Unigene Title                             R1    R2 
301580  AI878959   Hs.73737   splicing factor, arginine/serine-rich 1   7.41  11.92 
301676  Z43570     Hs.27453   ESTs, Moderately similar to G01251 Rar p  8.31  10.70 
301690  F05865     Hs.108323  ubiquitin-conjugating enzyme E2E 2 (homo  2.70  4.22 
301718  F07744     Hs.7987    DKFZP434F162 protein                      4.20  5.78 
301799  AA384252   Hs.286132  D15F37 (pseudogene)                       5.93  7.04 
301804  AA581004   Hs.62180   anillin (Drosophila Scraps homolog), act  1.70  0.76 
301822  X17033     Hs.271986  integrin, alpha 2 (CD49B, alpha 2 subuni  1.58  1.36 
301846  R20002     Hs.6823    hypothetical protein FLJ10430             1.00  1.00 
301868  T71508     Hs.13861   ESTs, Weakly similar to pH sensitive max  2.88  5.49 
301882  T78054                gb:yc97g09.M Soares infant brain 1NIB H   2.28  3.80 
301905  AI991127   Hs.117202  ESTs                                      1.00  1.00 
301948  AA344647   Hs.116724  aldo-keto reductase family 1, member B11  5.28  2.28 
301960  AW070252   Hs.2797;   KIAA0874 protein                                6.43 
302011  T91418                transcriptional adaptor 2 (ADA2, yeast,   3.03  3.42 
302016  N40834     Hs.23495   hypothetical protein FLJ11252             1.00  1.25 
302041  NM_001501  Hs.129715  gonadotropin-releasing hormone 2          0.71  0.99 
302072  AJ238381   Hs.132576  paired box gene 9                         1.60  1.71 
302094  AI286176   Hs.6786    ESTs                                      0.52 
302095  AW044300   Hs.137506  Homo sapiens BAC clone RP11-120J2 from 7  2.75 
302148  AW269618   Hs.23244   ESTs                                      3.04  3.37 






Hs.166361 
302155 AI088485 Hs.144759 

302201 AJ006278 Hs.159003 

302202 AF097159 Hs.159140 
302206 AI937193 Hs.41143 
302209 AF047445 
302235 AL049987 
302290 AL117607 Hs.175563 
302328 AA354849 Hs.23240 
302346 AL039101 Hs.194625 
302360 AJ010901 Hs.198267 
302384 Y08982 Hs.202676 
302406 U86751 Hs.211956 
302409 AF155156 Hs.218028 
302423 AB028977 Hs.225974 

Hs.272534 
Hs.227277 

302455 AA356923 Hs.240770 

302472 AA317451 Hs.6335 

302476 AF182294 Hs.241578 

302489 T80660 Hs.230424 

302490 AA885502 



302435 AF092047 



302705 
302711 
302719 
302742 
302755 
302771 
302789 
302795 

302802 Y08250 

302803 AA442824 
302812 N31301 
302847 X98940 
302885 AL137763 
302943 AI581344 
302977 AW263124 
303006 AF078950 
303011 AF090405 
303013 F07898 
303061 AF151882 
303077 AF163305 

303090 AA443259 

303091 AF192913 

303094 AF195513 

303095 AF202051 
303131 AW081061 

303195 AA082211 

303196 AA082298 
303216 AA581439 



303756 
303856 
303893 



ESTs 

transient receptor potential channel 6 
UDP-Gal:betaGlcNAc beta 1,4- galactosylt 
phosphoinositide-specific phospholipase 
killer cell lectin-like receptor subfami 
Homo sapiens mRNA; cDNA DKFZp564F112 (fr 
Homo sapiens mRNA; cDNA DKFZp564N0763 (f 
Homo sapiens cDNA FLJ13496 fis, clone PL 
dynein, cytoplasmic, light intermediate 
mucin 4, tracheobronchial 
synaptonemal complex protein 2 
CD3-epsilon-associated protein; antisens 
adaptor-related protein complex 4, epsil 

Homo sapiens mRNA; cDNA DKFZp564J062 (fr 
sine oculis homeobox (Drosophila) homolo 
UDP-N-acetylglucosamine:a-1,3-D-mannosid 
nuclear cap binding protein subunit 2, 2 
SWI/SNF related, matrix associated, acti 






Hs.48956 gap junction protein, beta 6 (connexin 3 
Hs.248572 

Hs.272100 
Hs.173560 



302647 X57723 Hs.198273 

302655 AJ227892 Hs.146274 

302656 AW293005 Hs.70704 
302668 AA580691 Hs.180789 

302679 H65022 

302680 AW192334 Hs.38218 
AJ001408 
U09060 
L08442 

W69724 Hs.288959 
L12069 

AW384815 Hs.149208 

H98476 Hs.42522 
AJ245067 

AJ245313 Hs.272838 



SMS3 protein 

odd Oz/ten-m homolog 2 (Drosophila, mous 
MCT-1 protein 
NADH dehydrogenase (ubiquinone) 1 beta s 
ESTs 

Homo sapiens, clone IMAGE:2823731, mRNA, 
S164 protein 

gb:yu66g11.r1 Weizmann Olfactory Epithel 



Hs.132127 
Hs.127812 
Hs.315111 



Hs.27693 

Hs.146286 
Hs.130683 
Hs.278953 
Hs.134079 
Hs.103180 



gb:Human autonomously replicating sequen 

hypothetical protein FLJ20920 

gb:Homo sapiens (clone WR4.10VH) anti-th 

KIAA1555 protein 

ESTs 

gb:Homo sapiens mRNA for immunoglobulin 
hypothetical protein FLJ10494 
gb:H.sapiens mRNA for variable region of 
ESTs, Moderately similar to putative DNA 
hypothetical protein FLJ20051 
gb:H.sapiens rearranged Ig heavy chain ( 
hypothetical protein LOC57822 
ESTs, Weakly similar to T17330 hypotheti 
hypothetical protein FLJ12894 
Homo sapiens cDNA: FLJ23137 fis, clone L 
gb:Homo sapiens clone 2A1 scFV antibody 
RAB22A, member RAS oncogene family 
peptidylprolyl isomerase (cyclophilin)-l 
gb:H.sapiens T-cell receptor mRNA 



sir, family me 



:er13A 



zinc finger protein 180 (HHZ168) 
Pur-gamma 
NM23-H8 
DC2 protein 

myosin, light polypeptide, regulatory, n 



AA132255 Hs.143951 

AW340037 Hs.115897 

AA205625 Hs.208067 

T80072 Hs.13423 

AF033122 Hs.14125 

AA398801 Hs.323397 

AA340605 Hs.105887 

AA359799 Hs.224662 
AA382814 

AF056083 Hs.24879 

AA504702 Hs.258802 



hypothetical protein FLJ10534 



AI738488 Hs.115838 ESTs 



ESTs, Weakly similar to Homolog of rat Z 
ESTs, Weakly similar to unnamed protein 
gb:EST96097 Testis I Homo sapiens cDNA 5 
phosphatidic acid phosphatase type 2C 
ATPase, (Na+)/K+ transporting, beta 4 po 



AA968589 Hs.180532 

N88597 Hs.113503 

AW467774 Hs.171880 

AW474196 Hs.306637 
AW513315 

AW513804 Hs.278834 

AW515465 

AW516449 

AW516611 

AW517947 



karyopherin (importin) beta 3 
polymerase (RNA) II (DNA directed) polyp 
I j c cf 1 U 1 M 
gb:xo43c12.x1 NCI_CGAP_Ut1 Homo sapiens 
ESTs, Weakly similar to ALU1_HUMAN ALU S 

11.X1NCI "A i Homo 
gb:xt58f05.x1 NCI_CGAP_Ut2 Homo sapiens 
gb:xp70b11.x1 NCI_CGAP_Ov35 Homo sapiens 
. t ,0 <1 NCLCGAPJJK Homo sapiens 
304008 AW518198 Hs.3297 

304009 AW518206 Hs.181165 
304024 T03036 



304097 R25376 

304114 R78946 

304122 H28966 

304155 H68696 



304234 W81608 

304267 AA064862 

304270 AA069711 

304287 AA079286 

304348 AA179868 

304415 AA290747 

304430 AA347682 

304456 AA411240 

304521 AA464716 

304526 AA476427 

304542 AA482602 

304546 AA486074 

304607 AA513322 

304640 AA524440 

304650 AA527489 

304735 AA576453 

304760 AA580401 

304849 AA588157 

304917 AA602685 

304921 AA603092 

304966 AA613893 

304987 AA618044 

305016 AA626876 

305034 AA630128 

305072 AA641012 

305111 AA644187 

305148 AA654070 

305159 AA659166 

305190 AA665955 

305232 AA670052 

305235 AA670480 

305245 AA676695 

305312 AA700201 

305322 AA701597 

305394 AA720942 

305413 AA724659 

305447 AA737856 

305476 AA745664 

305483 AA748030 

305528 AA769156 

305612 AA782347 

305614 AA782866 

305616 AA782884 

305637 AA806124 

305639 AA806138 

305650 AA807709 

305690 AA813477 

305726 AA828156 

305728 AA828209 

305759 AA835353 

305792 AA845256 

305864 AA864374 

305901 AA872968 

305910 AA875981 

306015 AA897116 

306017 AA897221 

306020 AA897630 

306063 AA906316 

306065 AA906725 

306104 AA910956 

306109 AA911861 

306148 AA917409 

306242 AA932805 

306288 AA936900 

306325 AA953072 

306353 AA961382 

306375 AA968650 

306396 AA970223 

306428 AA975110 

306442 AA976899 

306446 AA977348 



Hs.73742 
Hs.297753 
Hs.78466 



ribosomal protein S27a 
eukaryotic translation elongation factor 
gb:FB21E7 Fetal brain, Stratagene Homo s 
gb:FB26F2 Fetal brain, Stratagene Homo s 
gb:FB7C1 Fetal brain, Stratagene Homo sa 
ribosomal protein S14 
gb:yb42d06.s1 Stratagene fetal spleen (9 
gb:yb73g01.s1 Stratagene ovary (937217) 
gb:yc04c12.s1 Stratagene lung (937210) H 
ribosomal protein, large, P1 
gb:yi87g02.s1 Soares placenta Nb2HP Homo 
gb:ym31a06.s1 Soares infant brain 1NIB H 
gb:yr78b06.s1 Soares fetal liver spleen 
gb:yy82d08.s1 Soares_multiple_sclerosis_ 
gb:zd88h06.s1 Soares_fetal_heart_NbHH19W 
ribosomal protein, large, P0 






Hs.13801 
Hs.284136 
Hs.297753 



gb:EST54044 Fetal heartll Ho 
gb:zv26g05.s1 Soares_NhHMPu_S1 Homo sapi 
gb:zx82d 1.s1 Soares_ovary_tumor_NbHOT H 
gb:zx02c05.s1 Soares_total_fetus_Nb2HF8_ 
glyceraldehyde-3-phosphate dehydrogenase 
serine (or cysteine) proteinase inhibito 
gb:nh85e08.s1 NCI_CGAP_Br1.1 Homo sapien 
ferritin, light polypeptide 
ribosomal protein S23 

gb:nm75h11.s1 NCI_CGAP_Co9 Homo sapiens 
lb i I3g09.s1 NCI_CGAP_Cq12 Homo sapiens 
KIAA1685 protein 
PRO2047 protein 



immunoglobulin heavy constant gamma 3 (G 
gb:zu89h06.s1 Soares_testis_NHT Homo sap 
: i 1 i t i I r 3721 i 
gb:nr72a12.s1 NCI_CGAP_Pr24 Homo sapiens 
ESTs 

gb:nt01g08.s1 NCI_CGAP_Lym3 Homo sapiens 
EST, Weakly similar to EF1D_HUMAN ELONG 
gb:ag57d12,s1 & 



gb:ag37e01.s1 Jia bone marrow stroma Hom 
nuclear factor of kappa light polypeptid 
gb:zj44f07.s1 Soares_fetal_liver_spleen_ 



Hs.303405 
Hs.275668 
Hs.169476 
Hs.81328 



gb:ai10f08.s1 Soares_parathyroid_Uii _ 

gb:nx10c08.s1 NCI_CGAP_GC3 Homo sapiens 2.21 

Hs.287445 hypothetical protein FLJ11726 3.36 

Hs.303512 EST 1.00 

gb:nz12e05.s1 NCI_CGAP_GCB1 Homo sapiens 6.44 

Hs.272572 hemoglobin, alpha 2 0.19 

gb:aj09h02.s1 Soares_parathyroid_tumor_N 1.00 

Hs.275865 ribosomal protein S18 7.57 

gb:oe29a12.s1 NCI_CGAP_Pr25 Homo sapiens 4.78 

gb:oe29c12.s1 NCI_CGAP_Pr25 Homo sapiens 0.89 
gb:nw31e04.s1 NCI_CGAP_GCB0 Homo sapiens 4.49 

gb:ai67a05.s1 Soares_testis_NHT Homo sap 4.91 

Hs.73742 ribosomal protein, large, P0 0.19 

gb:of34a02.s1 NCI_CGAP_Kid6 Homo sapiens 5.12 

gb:ak72b05.s1 Barstead spleen HPLRB2 Hom 1.66 

gb:ak84a08.s1 Barstead spleen HPLRB2 Hom 2.34 

Hs.73742 ribosomal protein, large, P0 0.30 

gb:oh63h08.s1 NCI_CGAP_Kid5 Homo sapiens 2.10 

gb:nx21h02.s1 NCI_CGAP_GC3 Homo sapiens 0.32 
gb:am08b07.s1 Soares_NFL_T_GBC_S1 Homo s 1.56 

Hs.109058 ribosomal protein S6 kinase, 90kD, polyp 5.21 

Hs.130027 EST 1.95 

gb:ok03g03.s1 Soares_NFL_T_GBC_S1 Homo s 7.38 

gb:ok78g02.s1 NCI_CGAP_GC4 Homo sapiens 7.19 

gb:ok85h11.s1 NCI_CGAP_Kid3 Homo sapiens 6.50 

gb:og21a07.s1 NCI_CGAP_PNS1 Homo sapiens 4.21 

Hs.288036 tRNA isopentenylpyrophosphate transferas 2.20 

gb:oo60g04.s1 NCI_CGAP_Lu5 Homo sapiens 2.84 

gb:oi53h05.s1 NCI_CGAP_HN3 Homo sapiens 1.60 

Hs.210546 interleukin 21 receptor 1.65 

Hs.275865 ribosomal protein S18 3.78 

Hs.276018 EST, Moderately similar J,- - ds 4.30 

gb:op09d05.s1 NCI_CGAP_Kid6 Homo sapiens 0.95 

Hs.191228 hypothetical protein FLJ20284 3.19 

gb:oq35e09.s1 NCI_CGAP_GC4 Homo sapiens 4.67 

gb:oq72e12.s1 NCI_CGAP_Kid5 Homo sapiens 3.92 
306458 AA978186 

306467 AA983508 Hs.163593 

306510 AA988546 

306555 AA994304 Hs.276083 

306557 AA994530 

306572 AA995686 

306582 AA996248 

306598 AI000320 Hs.169476 

306605 AI000497 

306656 AI004024 

306676 AI005603 

306686 AI015615 

306702 AI022565 

306728 AI027359 

306751 AI032589 

306767 AI038963 



gb:op33c06.s1 Soares_NFL_T_GBC_S1 Homos 
ribosomal protein L18a 

gb:or84d07.s1 NCI_CGAP_Lu5 Homo sapiens 
EST, Weakly similar to RL23_HUMAN 60S R 
gb:ou57e08.s1 NCI_CGAP_Br2 Homo sapiens 
gb:os25c12.s1 NCI_CGAP_Kid5 Homo sapiens 
gb:os18c10.s1 NCI_CGAP_Kid5 Homo sapiens 
glyceraldehyde-3-phosphate dehydrogenase 
ribosomal protein, large P2 
gb:ou11b07.x1 Soares_NFL_T_GBC_S1 Homo s 

gb:ov29f10.x1 Soares_testis_NHT Homo sap 






306956 AI125111 

306958 AI125152 

307035 AI142774 

307041 AI144243 

307091 AI167439 

307181 AI189251 

307297 AI205798 

307317 AI208303 

307327 AI214142 

307382 AI223158 

307410 AI241715 

307415 AI242118 

307423 AI243206 

307426 AI243364 

307517 AI275055 

307551 AI281556 

307561 AI282207 



307691 AI318285 

307701 AI318583 

307718 AI333406 

307730 AI336092 

307760 AI342387 

307764 AI342731 

307783 AI347274 

307796 AI350556 

307807 AI351799 

307808 AI351826 
307820 AI355761 
307830 AI358722 
307852 AI365541 
307902 AI380462 
307997 AI434512 
308002 AI435240 
308011 AI439473 
308023 AI452732 
308041 AI458824 
308059 AI468938 
308085 AI474135 
308101 AI475950 
308106 AI476803 
308122 AI480123 
308154 AI500600 
308171 AI523632 
308211 AI557029 
308213 AI557041 
308216 AI557135 
308219 AI557246 
308271 AI567844 
308319 AI583983 
308362 AI613519 
308413 AI636253 
308450 AI660860 
308464 AI672425 
308588 AI718299 
308599 AI719893 
308615 AI738593 
308643 AI745040 
308673 AI760864 
308697 AI767143 
308762 AI807405 
308778 AI811109 
308782 AI811767 
308808 AI818289 
308823 AI824118 



Hs.249118 ESTs 



gb»w70h12.s1SoareaJetaUI»er_spteen_ 



gb:qa75h12.x1 Soares_fetal_heart_NbHH19W 
gb:qa33c06.s1 Soares_NhHMPu_S1 Homo sapi 
gb:am66f03.s1 Barstead spleen HPLRB2 Hom 



ribosomal protein L13a 

gb:qb85b12.x1 Soares_fetal_heart_NbHH19W 
gb:ox70h06.s1 Soares_NhHMPu_S1 Homo sapi 



Hs.111334 ferritin, light polypeptide 
Hs.147333 
Hs.246381 
Hs.147885 

Hs.77039 ribosomal protein S3A 



ESTs 



gb:qh92b02.x1 Soares_NFL_T_GBC_S1 Homo s 
collagen, type I, alpha 2 

gb:qh30g11.x1 Soares_NFL_T_GBC_S1 Homo s 
gb:ql72d03.x1 Soares_NhHMPu_S1 Homo sapi 
gb:qu52f11.x1 NCI_CGAP_Lym6 Homo sapiens 
gb:qp65a12.x1 fetal Inn N H 

gb:qm01f02.x1 Soares_NhHMPu_S1 Homo sapi 



Hs.309411 EST 



gb:qt43b07.x1 Soares_fetal_lung_NbHL19W 
gb:qt27f07.x1 Soares_pregnant_uterus_NbH 
gb:qo26a07.x1 NCI_CGAP_Lu5 Homo sapiens 
gb:tc05d02.x1 NCI_CGAP_Co16 Homo sapiens 
gb:qt18f09.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt09d02.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt09g03.x1 NCI_CGAP_GC4 Homo sapiens 
gb:qt94a11.x1 NCI_CGAP_Co14 Homo sapiens 
EST, Weakly similar to R5HU22 ribosomal 
gb:qz08g05.x1 NCI_CGAP_CLL1 Homo sapiens 
gb:tg02h05.x1 NCI_CGAP_CLL1 Homo sapiens 
eukaryotic translation elongation factor 
ESTs 

gb:ti60a08.x1 NCI_CGAP_Lym12 Homo sapien 
hemoglobin, alpha 1 

glyceraldehyde-3-phosphate dehydrogenase 
EST, Weakly similar to RL10_HUMAN 60S R 
eukaryotic translation elongation factor 
eukaryotic translation elongation factor 
gb:tj77e12.x1 Soares_NSF_F8_9W_OT_PA_P. 



Hs.181165 
Hs.105749 
Hs.196511 
Hs.96840 
Hs.277117 



Hs.259408 

Hs.2186 

Hs.217493 



gb:tn93d08.x1 NCI_CGAP_Ut2 Homo sapiens 
ESTs, Weakly similar to schlafen4 [Mrai 
anaplastic lymphoma kinase (Ki-1) 
gb:PT2.1_12_E04.r tumor2 Homo sapiens cD 
gb:PT2.1_13_H06.r tumor2 Homo sapiens cD 
gb:PT2.1_15_D07.r tumor2 Homo sapiens cD 
ribosomal protein S3 
eukaryotic translation elongation factor 
KIAA0553 protein 
ESTs 

KIAA1527 protein 

EST, Moderately similar to I38055 myosi 
gb:as51g12.x1 Barstead aorta HPLRB6 Homo 
gb:as47d07.x1 Barstead aorta HPLRB6 Homo 
hypothetical protein FLJ23045 
gb:tr19a12.x1 NCI_CGAP_Ov23 Homo sapiens 

b:w'09c10.x1 NCI P.CLL1 Homosapien 
gb:wi97a07.x1 NCI_CGAP_Kid12 Homo sapien 
ESTs 

gb:W)4c11.x1 NCI_CGAP_Ov23 Homosapiens 
eukaryotic translation elongation factor 
gb:v*52c01.x1 NCI_CGAP_Pr22 Homosapiens 

gb:a!48g03.x1 Barstead colon HPLRB7 Homo 
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thymosin, beta 4, X chromosome 




AI833240 




gb:at76d10.x1 Barstead colon HPLRB7 Homo 


308898 


AI858845 




gb:wl32d10.x1 NCI.C6AP.Wl Homo sapiens 


308934 


AI865023 




phosphatidylirositol glycan, class H 


308966 


AI870704 




gb:wi47h01.x1 NCI CGAP Ut1 Homo sapiens 








gt 52h <1 NCI GAP n2 'un f 




AI910902 




gb:tq39101.x1 NCI CGAP_Ut1 Homo sapiens 




AI911975 




gb:wd78d01.x1 NCI CGAP_Lu24 Homo sapiens 


309069 


AI917366 


Hs.78202 


SWI/SNF related, matrix associated, act 




AI922426 


Hs.1 19598 


ribosomal protein L3 




AI925503 


Hs.265884 


ESTs 




AI928178 




gb:wo95a11.x1 NCI_CGAP_Kid1 1 Homosapien 




AI928816 




ribosomal protein L13




AI937761 




gb:wp84b09.x1 NCI_CGAP_Brn25 Homo sapien








gb:wx63g05.x1 NCI_CGAP_Bt18 Homo sapiens




AI991525 


Hs.299426


ESTs 




AW003478 




gb:wq66c06.x1 NCI_CGAP_GC6 Homo sapiens




AW004823 




gb:ws93a08.x1 NCI_CGAP_Co3 Homo sapiens 




AW085201 




EST 




AW090702 


Hs.278242 


tubulin, alpha, ubiquitous 




AW117645




keratin 18 




AW129368 




gb:xe14b05.x1 NCI_CGAP_Ut4 Homo sapiens 




AW136325 


Hs.279771 


Homo sapiens clone PP1596 unknown mRNA 




AW150807 


Hs.181357 


laminin receptor 1 (67kD, ribosomal pro 




AW151119 




gb:xg33e10.x1 NCI_CGAP_Ut1 Homo sapiens 




AW192004


Hs.297681 


serine (or cysteine) proteinase inhibit 




AW194230


Hs.253100


EST, Moderately similar to GHHU Ig gamm 




AW205681 


Hs.253506 


EST, Moderately similar to ATPN_HUMAN A


309693 


AW237221 




laminin receptor 1 (67kD, ribosomal prot




AW238011


Hs.295605 


mannosidase, alpha, class 2A, member 2 


309700 


AW241170 


Hs.179661 


tubulin, beta polypeptide 




AW264889 




gb:xq36h02.x1 NCI CGAP Lu28 Homo sapiens 


309769 


AW272346 




gb:xs13c10.x1 NCI_CGAP_Kid11 Homo sapien


309782 


AW275156 


Hs.156110 


immunoglobulin kappa constant 


309783 


AW275401 


Hs.254798 


EST 


309799 


AW276964 




gb:xp58h01.x1 NCI_CGAP_Ov39 Homo sapiens 




AW299916 




gb:xs44c01.x1 NCI_CGAP_Kid11 Homosapien 




AW339071 


Hs.300697 


immunoglobulin heavy constant gamma 3 (G 




AW340684 




gb:hd05g08.x1 Soares_NFL_T_GBC_S1 Homo s 




AW341418 




gb:hd08c03.x1 Soares_NFL_T_GBC_S1 Homo s




AW341683 




gb:hd13d01.x1 Soares_NFL_T_GBC_S1 Homo s




AW341936 




gb:hb73f10.x1 NCI_CGAP_Ut2 Homo sapiens




AW449111 




hypothetical protein MGC3265 




AI439096


Hs.323079 


Homo sapiens mRNA; cDNA DKFZp564P116 (fr




AW136822


Hs.172824 


ESTs, Weakly similar to B48013 proline-r 






Hs.161354 


ESTs 




AI203094 




ESTs 




AW197233


Hs.147253


ESTs 






Hs.223796 


ESTs 




AW195642 




ESTs 




AI206614 


Hs.197422


ESTs 




AI627653 


Hs.147562


ESTs 




AW450439 


Hs.153378


ESTs 


310261 


AI240483 


Hs.201217


ESTs 








metallothionein 1E (functional) 




AI242102 




ESTs 




AI243332 


Hs.156055


ESTs 




AW013815 




ESTs 




AI253200 




ESTs 




AI261340 


Hs.145517


ESTs 


310385 


AI263392




ESTs 






Hs.164231 






AW196632


Hs.252956 


ESTs 






Hs.145926


ESTs 


310468 


AI984074 


Hs.196398 


ESTs 




AI948801 


Hs.171073 






AW275603 


Hs.200712


ESTs 


310514 


AI681145


Hs.160724 


ESTs 




AW082270 


Hs.12496 


ESTs, Highly similar to AC004836 1 simil 




AI302654 


Hs.208024 


ESTs 








ESTs 






Hs.196102


ESTs 








gb:Human endogenous retrovirus H proteas 






Hs.164175


ESTs 




AI347863


Hs.156672 


ESTs 




AI654370


Hs.157752


Homo sapiens mRNA full length insert cDN 




AI472124 


Hs.157757


ESTs 


310714 






ESTs 


310722 


AI989803 


Hs.157289 


ESTs 


310756 


AI916560 


Hs.158707 


ESTs 


310764 




Hs.167172 


ESTs 


310848 


AI459554 


Hs.161286 


ESTs 


310851 


AW291714 


Hs.221703 


ESTs 


310854 


AI421677 


Hs.161332 


ESTs 


310858 


AI871000


Hs.161330 


ESTs 
Hs.132817














ESTs, Moderately similar to ALU7_HUMAN A




























































































































































































AW072813 


Hs.270868 






























































































AI791521 












































































































AW449774 
















































AI922143 


















































Hs.213081 


























































AW131785 
















Homo sapiens cDNA FLJ12981 fis clone NT 
































AI056769 


Hs.133512














ESTs, Weakly similar to KIAA0973 protein












































AA522738 


Hs.132554














gb:UI-H-BI1-afg-g-02-0-UI.s1 NCI_CGAP_Su


















































































Hs.272203 
























Hs.284450












Hs.268591 






































































AI352096 
























AI052609 










312147 




Hs.195648


ESTs 


0.67 


1.03 


312175


AA953383 


Hs.127554


ESTs 


5.85 


10.60 


312179 


AI052572


Hs.269864 


ESTs 


2.41 


3.32 


312201 


AI928365 


Hs.91139 


solute carrier family 1 (neuronal/epithe


0.24 


0.39 


312207 


H90213 


Hs.191330


ESTs 


2.20 


4.55 


312220 


N74613 




gb:za55a07.s1 Soares fetal liver spleen 


4.28 


11.13 


312252 


AI128388 


Hs.143655


ESTs 


1.64 


1.57 


312304 


AA491949 


Hs.269392 


ESTs 


0.12 


2.47 
AW235092


Hs.143981 










AA216698




TERA protein 












































AI675558












AI375096 
























AI863140 
























AI051133 












R59989 


Hs.176539 


































AI742591 












AI566228 




hypothetical protein PRO2121










Hs.35088 










AI193122 


Hs.124141 










AI865073 


Hs.125720










AA046451 


Hs.165200 






















AI240582 












AW450461 
























AW152104 
























AI016204 


























Hs.271695










AI681581 




ESTs 








AI640506 




ESTs, Weakly similar to ALU7_HUMAN ALU S 




















AA497043 














Hs.177337






















AA731520 












AI419290 


Hs.149990


ESTs, Weakly similar to unnamed protein 








AW293055 
































313070 


AI422023 


Hs.161338 






















AW449171 


























































313239 


















ESTs, Weakly similar to testicular tekti








AI770008 












AI027604 










313290 


AI753247 


Hs.29643 


Homo sapiens cDNA FLJ13103 fis, clone NT








AI362991


Hs.202121 


ESTs, Weakly similar to env protein [H.s 










Hs.127832 










AW074848 


Hs.201501 










AI674685 


Hs.200141 










AW376889 












AI241540 


Hs.132933 










AA741151 


Hs.137323










AA576052 


Hs.193223


Homo sapiens cDNA FLJ11646 fis, clone Hb








AI261390 




KIAA1345 protein 
































AI273419 


Hs.135146 


hypothetical protein FLJ13984








AA041455 










313638 


AI753075 


Hs.104627










AA740151 


Hs.130425










W49823 


Hs.104613 


RP42 homolog 








AW468891 


Hs.122948


























































AVM36836 


Hs.144583




















313790 


AW078569 












AW271022 










313834 


AW418779 


Hs.114889


ESTs 


0.58 


3.U 


313835 


AI538438 


Hs.159087 


ESTs 


5.74 


8.88 


313852 


H18633 


Hs.123641


protein tyrosine phosphatase, receptor t 


0.15 


1.14 


313854 


AW470806 


Hs.275002 


ESTs 


2.09 


4.06 


313865 


AA731470 


Hs.163839 


ESTs 


3.41 


4.09 


313871 


AW471088 


Hs.145950 


ESTs 


5.28 


6.83 


313883 


AI949384 




gb:nu76d01.s1 NCI_CGAP_Alv1 Homo sapiens


2.90 


10.91 


313915 


AI969390 


Hs.163443


Homo sapiens cDNA FLJ11576 fis, clone HE


1.00 


1.00 
313926 AW473830 Hs.171442

313948 AW452823 Hs.135268 

313978 AI870175 Hs.13957 

313983 AI829133 Hs.226780 ESTs 
314035 

314037 AW300048 

314040 AA166970 

314067 AW293538 

314103 AI028477 

314107 AA806113 

314113 AA218986

314124 AW118745 

314126 AA226431 

314128 AA935633 Hs.194628 ESTs

314151 AA236163 Hs.202430 ESTs 

314184 AW081795 Hs.233465 ESTs 

314192 AW290975 Hs.118923 ESTs

314244 AL036450 Hs.103238 ESTs 

314253 AA278679 Hs.189510 ESTs 
Hs.270152
Hs.275272 
Hs.118748 
Hs.51743 
Hs.132775 
Hs.189025 



314443 
314458 
314466 
314478 
314482 
314506 
314519 
314529 
314546 
314562 
314579 
314580 
314585 
314589 
314592 
314603 
314604 
314606 
314648 
314699 
314701 
314710 
314750 
314767
314801
314817
314835

314852

314853
314940



AA292275 Hs.193746 ESTs 

AI628633 Hs.324679 ESTs 

AA827125 Hs.192043 ESTs 

AI217440 Hs.143873 ESTs 

AA767818 Hs.122707 ESTs 

AI521173 Hs.125507 DEAD-box protein 



AW007211 
AI564127 
AW197442 



Hs.210862 
Hs.202151 
Hs.16131 
Hs.143493 



Hs.255938 
Hs.216363 
Hs.153408 
AA435761 Hs.192148



Homo sapiens cDNA FLJ14056 fis, clone HE
ESTs

hypothetical protein FLJ12876 

ESTs 

ESTs 

ESTs, Moderately similar to KIAA1200 pro 
ESTs 

Homo sapiens cDNA FLJ10570 fis, clone NT 



AI754634 
AI669131 
AI095005 



Hs.188767 ESTs 

Hs.132801 ESTs 

Hs.131987 ESTs

Hs.290989 EST

Hs.136174 ESTs 



Hs.76064 



Hs.153279 
Hs.162045

314943 AI476797 Hs.184572 
314955 AA521382 Hs.192534 
314973 AW273128 Hs.300268 



315056 
315069 
315071 
315073 
315078 
315080 
315120 
315175 
315193 
315196 
315200 
315254 
315353 
315397 
315403 
315431 
315454 



AA527941 
AI538613 
AI493046 
AI569476 
AI202703 
AI821517 



AI808235 
AI474433 
AW452608 



Hs.298241 
Hs.146133 
Hs.177135 
Hs.152414 
Hs.105866 
Hs.152423 
AW452948 Hs.257631 



Hs.269477 
Hs.152530 
Hs.131765 



cell division cycle 2, G1 to S and G2 to 



Homo sapiens cDNA: FLJ21274 fis, clone C 



ns clone TCCCTAO0 151 ml 



Hs.179556 ESTs 

Hs.279610 hypothetical protein FLJ10493 



AA622104 Hs.184838 ESTs 



gb:qh36f02.x1 Soares_NFL_T_GBC_S1 Homo s 3.46 
315530 AI200852 Hs.127780 ESTs

315541 AI168233 Hs.123159 sperm 

315552 AW445034 Hs.256578 ESTs 

315562 AA737415 Hs.152826 ESTs 

315577 AW513545 Hs.17283 

315587 AI268399 Hs.140489 

315589 AW072387 Hs.158258 

315623 AA364078 Hs.258189 

315634 AA837085 Hs.220585 

315668 AA912347 Hs.136585 

315677 AI932662 Hs.164073 

315706 AW440742 Hs.155556 

315707 AI418055 Hs.161160 
315730 H25699 Hs.201591 
315745 AI821759 Hs.191856 
315791 AA678177 
315801 AA827752 Hs.266134 
315820 AI652022 Hs.258785 
315878 AA683336 Hs.189046 
315905 AI821911 Hs.209452 
315923 AI052789 Hs.133263 
315954 AW276810 Hs.254859 
315978 AA830893 Hs.119769 
316001 AI248584 Hs.190745 

316011 AW516953 Hs.201372 

316012 AA764950 Hs.119898
316040 AI983409 Hs.189226 
316048 AI720759 Hs.224971 
316076 AW297895 Hs.116424 
316124 AI308862 Hs.167028 
316151 AI806016 Hs.156520 
316187 AW518299 Hs.192253 
316204 AA731509 Hs.120257 
316232 AW297853 Hs.251203 
316275 AI671041 Hs.292611 
316291 AW375974 Hs.156704 
316303 AA740994 Hs.209609 
316344 AA744518 Hs.120610 
316346 AI028478 Hs.157447 
316365 AI627845 Hs.210776 
316380 AI393378 Hs.164496 
316470 AA809902 Hs.243813
316509 M76731C Hs.291766 
316514 AA768037 Hs.291671
316519 AI929097 
316609 AW292520 Hs.122082 
316633 AI125586 Hs.127955 
316700 AW172316 Hs.252961 
316711 AI743721 Hs.285316 
316713 AI090671 Hs.134807 
316715 AI440266 Hs.170673 
316787 AW369770 Hs.130351 
316809 AA825839 Hs.202238 

316811 AA922060 Hs.132471

316812 AW135045 Hs.232001 
316818 AA827176 Hs.124316 
316824 AA837416 Hs.124299 
316827 AI380429 Hs.172445 
316891 AW298119 Hs.202536 
316951 AA134365 Hs.57548 

316970 AA860172 Hs.132406

316971 AA860212 Hs.170991 
316990 AA861611 Hs.130643 
317001 AI627917 Hs.233694 
317008 AW051597 Hs.143707 
317051 AA873253 Hs.126233 

317128 AA971374 Hs.125674 

317129 H12523 Hs.78521 
317137 AW341567 Hs.125710 
317196 AI348258 Hs.153412 
317212 AI866468 Hs.148294 

317223 AW297920 Hs.130054 

317224 D56760 Hs.93029 
317266 AA906289 
317282 AI807444 Hs.176101 
317285 AW370882 Hs.222080 
317302 AA908709 Hs.135564 
317304 AW449899 Hs.130184 
317320 AA927151 Hs.130452
317413 AW341701 Hs.126622 
317417 AA918420 Hs.145378 
317452 AA972965 Hs.135568
317519 AI859695 Hs.126860 
317521 AI824338 Hs.126891
317529 AI916517 Hs.126865 
hypothetical protein FLJ10890

Homo sapiens mRNA; cDNA DKFZp434B1272 (f



hypothetical protein FLJ20202 



ESTs, Moderately similar to ALU5_HUMAN A 
ESTs 

Homo sapiens cDNA: FLJ21326 fis, clone C



ESTs, Moderately similar to ALU1_HUMAN A



gb:od10c11.s1 NCI_CGAP_GCB1 Homo sapiens 4.41 

ESTs 1.00 

ESTs 2.51 

ESTs, Weakly similar to ALU1_HUMAN ALU S 3.46

ESTs, Moderately similar to ALU7_HUMAN A 4.45

hypothetical protein FLJ12057 0.30 

ESTs, Weakly similar to AF126780 1 refin 0.20



hypothetical protein FLJ11350 



Homo sapiens cDNA: FLJ21193 fis, clone C



Hs.203614 ESTs 






Hs.199828 ESTs 

Hs.192123 ESTs 

Hs.132553 ESTs 

Hs.127346 ESTs 

Hs.127785 ESTs 
317570 AI733361 Hs.127122
317571 

317598 AW206035 

317627 AI346110 

317650 AI733310 

317659 AA961216

317674 AW294909 Hs.132208 

317686 AA969051 Hs.187319 

317692 AI307659 Hs.174794 

317701 AI674774 Hs.128014 

317711 AI733015 Hs.272189 

317722 AI733373 Hs.128119 ES" 

317756 AA973667 Hs.128320 

317777 AI143525 Hs.47313

317799 AI498273 Hs.128808 

317803 AA983251 Hs.128899 

317821 AI368158 Hs.70983 

317848 AI820575 Hs.129086 

317850 N29974 Hs.152982 

317861 AW341064 Hs.129119 

317865 AI298794 Hs.129130 

317869 AW295184 Hs.129142 

317881 AI827248 Hs.224398 

317890 AI915599 Hs.129225 

317899 AI952430 Hs.150614 

317986 AI005163 Hs.201378 

318001 AW235697 Hs.130980

318016 AI016694 Hs.256921 

318023 AW243058 Hs.131155 

318054 AW449270 Hs.232140 

318068 AI024540 Hs.131574 

318117 AI208304 Hs.250114 

318187 AI792585 Hs.133272

318223 AI077540 Hs.134090 

318240 AI085377 Hs.143610 

318255 AI082692 Hs.134662 

318266 AI554341 Hs.271443 

318330 AI093840 
318428 AI949409

318458 AI149783 

318467 AI151395 

318473 AI939339 

318476 AI693927 

318487 AI167877 

318488 AI217431 
318491 T26477 
318499 T25451 

318537 AA377908

318538 N28625 
318547 R20578 



318622 
318629 
318637
318648 
318650 
318671 
318679 
318711 
318725 
318728 
318740 
318776 
318784 
318816 



319041 
319103 
319170 
319196 
319199 
319242 



H05896 
R13678 
F07953 
F07361 
F11472 



Hs.16085 
Hs.13306 
Hs.12839 



KIAA0258 gene product



RhoGAP 1 

Homo sapiens cDNA FLJ12007 fis, clone HE
' FLJ13117 



Homo sapiens cDNA FLJ11469 fis, clone HE



Hs.143758 
Hs.170974
Hs.194591
Hs.158438
Hs.144834 
Hs.146883 
Hs.265165 
Hs.143716 
Hs.144709 



Hs.90431 
Hs.90363 
Hs.107761
Hs.49007 



AA779704 

AI470235 Hs.172698

T48325 Hs.237658 

N25163 Hs.8861 

AA243539 Hs.9196

T77141 Hs.184411 

AA393302 Hs.176626 

AA188823 Hs.299254

T58115 Hs.10336

AI936475 Hs.101282

AI962487 Hs.242950

Z30201 Hs.291289 

NM_002543 Hs.77729

R24963 Hs.23766 

H00148 Hs.5181 

F07873 Hs.21273 
H10818 



ESTs, Weakly similar to unnamed protein 

poly(A) polymerase alpha 

Homo sapiens cDNA FLJ12136 fis, clone MA



Homo sapiens cDNA: FLJ21238 fis, clone C



proliferation-associated 2G4, 38kD 



Hs.18268 

Z43224 Hs.124952

F08138 Hs.7387 

AW368520 Hs.301528 

Z43577
AI219221 
Z44140 
Z44186 



ESTs 



Hs.269622 ESTs 



ESTs, Highly similar to MAON_HUMAN NADP-
ESTs, Weakly similar to weak similarity
KIAA1313 protein
putative selenocysteine lyase 
putative G-protein coupled receptor 
319944
319947 
319962 
320007 
320018 



320112 
320140 
320188 



320219 
320220 
320225 



320413 
320432 
320436 



Hs.6818 

Hs.290263 

Hs.12677 

Hs.12876 

Hs.79059 

Hs.285243 



Hs.325823 
Hs.13911 
Hs.301743 
Hs.191196 



Hs.191198 
Hs.111991
Hs.116417 
Hs.19717 
Hs.184221 

Hs.250799 
Hs.270104 
Hs.191184 
Hs.14355 



Homo sapiens cDNA: FLJ21927 fis, clone H 
319552 AA096106



319609
319611
319653



H06382 
R15372 

R18178 



N74880 

AA071267 

T78517 

AA258981 

T77559 

H79460 

T79248 



H63789 Hs.296288 
AI699772 Hs.292664 
AA233671 Hs.87164



AA074108 
H58138 
AW411307 
T92107 



R22291 
AA203711 
R62786 
AA253352



ESTs, Weakly similar to Y48A5A.1 [C.eleg



gb:yd52a10.s1 Soares fetal liver spleen
ESTs 

ESTs, Moderately similar to ALU8_HUMAN A



Hs.106604
Hs.167481 
Hs.21398 
Hs.21400 
Hs.22664 
Hs.21162 
Hs.295866 
Hs.22646 
Hs.117414
Hs.271350 
Hs.264330 

Hs.13941 
Hs.291392 
Hs.94109 
Hs.271722 
Hs.133510 
Hs.14479
Hs.135056



hypothetical protein FLJ21103 
gb:ym19c10.M Soares infant brain 1NIB H
uncharacterized hypothalamus protein HT0 



Homo sapiens mRNA; cDNA DKFZp434N1923 (f 3.02



gb:zm61g01.r1 Stratagene fibroblast (937



ESTs 

Homo sapiens cDNA FLJ14199 fis, clone NT 
Human DNA sequence from clone RP5-850E9 
gb:EST40943 Endometrial tumor Homo sapie
gb:yd40h09.r1 Soares fetal liver spleen
ESTs, Weakly similar to KIAA0638 protein
ESTs, Weakly similar to A46010 X-linked
hypothetical protein FLJ14001 
EST 

FOXJ2 forkhead factor 



Hs.117915 ESTs
Hs.114311 CDC45 (cell division cycle 45, S.cerevis



R62203 Hs.24321 

R78659 Hs.29792 

AL049227 Hs.124776

AA327564 Hs.127011 



Homo sapiens cDNA FLJ12028 fis, clone HE
ESTs 

Homo sapiens mRNA; cDNA DKFZp564N1115 (f



H03139 Hs.24683 
NM_003608 Hs.131924 
AL049337 Hs.132571 
H06019 Hs.151293 
AF077374 Hs.139322 
AI167978 Hs.139851 
AF026004 Hs.141660
H10807 Hs.281434 
Hs.31286
Hs.23368 
Hs.173269 
Hs.124136 



G protein-coupled receptor 65 

Homo sapiens mRNA; cDNA DKFZp564P016 (fr 

Homo sapiens cDNA FLJ10664 fis, clone NT

small proline-rich protein 3 

caveolin 2

Homo sapiens cDNA FLJ14028 fis, clone HE
ESTs 

Homo sapiens clone FLC0578 PRO2852 mRNA,
320448 AI240233 Hs.80887
320451 R26944 Hs.180777
320484 AA094436 Hs.296267 
320499 R32555 Hs.24321 
320514 AB007978 Hs.158278 
320521 N31464 Hs.24743 

320526 AW374205 Hs.111314 

320527 R34672 Hs.324522 
320536 AA331732 Hs.137224 
320556 AF054177 Hs.14570 
320564 AF056209 Hs.159396 
320587 Z44524 Hs.167456
320635 R54159 Hs.80506 
320639 AA243258 Hs.7395 

Hs.26549 
Hs.111334
Hs.91251 
Hs.300511 
320648 N48521

320651 AA489268 

320664 AI904216 

320676 AA132650 

320683 R59291 

320689 AA334609

320696 AW135016 

320714 AI445591 

320727 U96044 

320771 AI793266 

320794 AA281993 

320822 AF100780 

320824 AF120274 

320830 AJ132445 

320843 AA317372 



AI205786 



Hs.266416 
Hs.34744 
Hs.34771 
Hs.135904 
Hs.271580 
Hs.199538 

Hs.213923

320957 AI878933 Hs.92023
320997 H22544 

321045 W88483 Hs.293650 

321046 H27794 Hs.269055 
321052 AW372884 Hs.240770 

321059 AI092824 Hs.126465
321062 R87955 Hs.241411 
AF131782 Hs.241438 
AA018306 

H43750 Hs.125494 
AI817933 Hs.298351 
AA336635 Hs.99598 



321067 
321102 
321130 
321142 
321155 
321158 
321170 
321199 
321206 
321225 
321236 
321244 
321270 
321317 
321318 



321418 
321420 
321430 
321453 
321467 
321468 
321491 
321498 
321504 
321510 
321513 
321516 
321565 
321577 
321581 
321582 
321587 
321626 
321628 
321642 
321669 
321687 
321688 



N53742 Hs.172982
AW385512 

H54178 Hs.226469 

AL080073 Hs.251414 

AW371941 Hs.18192 
AF068654 
R83560 

AI937060 Hs.6298 

AB033041 Hs.137607 

AB033100 Hs.300646 

AA127984 Hs.222024 

R93443 Hs.271770 

AI739161 Hs.161075 

AI368667 Hs.132743 
U05890 

N50080 Hs.82845 

AA514198 Hs.38540 

H70665 Hs.292549 

AW295517 Hs.255436 

W02356 Hs.268980 

AA703650 Hs.255748 

H84972 Hs.108551 

AI382803 Hs.159235

AI525773 Hs.266514 
H84260 

AA019964 Hs.28803 

AA143756 Hs.21858
H95531 

AA295430 Hs.96322 

H87064 Hs.161051 

AW085917 Hs.247084 

H95404 Hs.294110 
AA625149 

H97646 Hs.123158

AA700017 Hs.173737 

N55160 Hs.167260 

AW390923 Hs.42568 



Homo sapiens cDNA FLJ12028 fis, clone HE 

KIAA0509 protein 

hypothetical protein FLJ20171 

ESTs 

ESTs 

ESTs 

hypothetical protein FLJ22530 
peptidylglycine alpha-amidating monooxyg
Homo sapiens mRNA full length insert cDN 
small nuclear ribonucleoprotein polypept 
hypothetical protein FLJ23182
Homo sapiens mRNA for KIAA1708 protein,
ferritin, light polypeptide 

' '1 FU11198 



gb:yq04a10.r1 Soares fetal liver spleen
immunoglobulin lambda locus 
poly(A)-binding protein, nuclear 1 
ESTs 

WNT1 inducible signaling pathway protein 



Homo sapiens mRNA; cDNA DKFZp547C13S (fr



nuclear cap binding protein subunit 2, 2 
ESTs 

Homo sapiens mRNA full length insert cDN 
Homo sapiens clone 24941 mRNA sequence 
gb:ze40d08.r1 Soares retina N2b4HR Homo 
ESTs 

ASPL protein 

hypothetical protein MGC5338 
gb:yu75f11.r1 Soares fetal liver spleen
ESTs 

gb:yy56d10.s1 Soares_multiple_sclerosis_
Homo sapiens cDNA FLJ12417 fis, clone MA
Homo sapiens mRNA; cDNA DKFZp564B1462 (f 
Ser/Arg-related nuclear matrix protein ( 
gb:Homo sapiens isolate AN.1 immunoglobu 
gb:yv75c06.s1 Soares fetal liver spleen
KIAA1151 protein
KIAA1215 protein 

KIAA protein (similar to mouse paladin)
transcription factor BMAL2 



gb:H.sapiens (DIG3) mRNA for immunoglobu 
Homo sapiens cDNA: FLJ21930 fis, clone H
gb:Human 2a12 mRNA for kappa-immunoglobu 



trinucleotide repeat containing 3 
gb:ys7Se02.r1 Soares retina N2b4HR Homo
hypothetical protein FLJ23560 
ESTs, Moderately similar to ALU6_HUMAN A 



gb:af70c12.r1 Soares_NhHMPu_S1 Homo sapi
Homo sapiens cDNA FLJ12830 fis, clone NT
ras-related C3 botulinum toxin substrate 
321709 N25847 Hs.10!

321710 N35682 
321775 AI694875 
321777 AI637993 
321779 N42729 
321829 D81993 
321846 AA281594 
321879 AL109670
321883 AA426494
321899 N55158 
321911 AF026944 






Hs.259743 
Hs.202312 
Hs.202312 



RAB38, member RAS oncogene family 



tumor endothelial marker 8 



322002 AA328801 

322035 AL137517 

322044 AW340926 

322057 N92197 

322060 AI341937 

322070 U80769 

322083 AF074982 

322091 AI819863 

322125 R93901 

322130 R98978 

322147 AF085919 

322166 AF085958 

322173 H52567 

322178 H56535 

322179 H92891 
322186 H67346 
322196 W87895 
322212 AF087995 
322221 AI890619 

322277 AI640193 

322278 AF086283 
322284 AI792140 
322288 AL037273 
322320 AF086419 
322336 AA308526 
322339 W17348 
322366 AW404274 
322372 W25624 
322374 AI394663 
322378 AF064819 
322388 AI815730 
322416 AA223183 
322419 AA248987 
322425 W37943 
322431 AA069222 
322450 AA040131 
322465 AA137152 
322467 AF116826
322473 AA744286
322509 T52172 
322523 W80398 
322527 AF147359 



Hs.29468 
Hs.293797 
Hs.181694 

Hs.132882 
Hs.272759 
Hs.158923 
Hs.84522 
Hs.306201 



Hs.153943
Hs.122116 
Hs.201877 
Hs.247474 
Hs.298442 
Hs.14084 
Hs.34892 
Hs.141892 
Hs.25144 
Hs.286049 
Hs.180340 



322566 W87285 
322585 AA837622 
322635 AA679084 
AA007352 



322968 AI905228 
322971 C15953 
322981 AA493252 



gb:qt10e03.x1 NCI_CGAP_GC4 Homo sapiens
Homo sapiens mRNA for KIAA17S6 protein, 
ESTs, Highly similar to KIAA0535 protein



gb:yq16o12.r1 Soares fetal liver spleen 



Hs.106243 ESTs



Hs.269187 ESTs 
Hs.211516 
Hs.134877
Hs.179662
Hs.226389 



gb:yr88b03.r1 Soares fetal liver spleen 
gb:yt85d04.r1 Soares_pineal_gland_N3HPG 
gb:yt88g03.r1 Soares_pineal_gland_N3HPG
gb:yt94c02.s1 Soares_pineal_gland_N3HPG



nucleosome assembly protein 1-like 1
ESTs 

gb:zd46f01.r1 Soares_fetal_heart_NbHH19W



Hs.76152 decorin 



ESTs 

ESTs, Moderately similar to Osf2 [M.musc 
DESC1 protein 

hypothetical protein FLJ21032 

adaptor-related protein complex 3, mu 1 

ring finger protein 7 

KIAA1323 protein

ESTs 

ESTs 

phosphoserine aminotransferase 
putative protein-tyrosine kinase 
tRNA selenocysteine associated protein 



1916847 Hs.270947 ESTs 



gb:Homo sapiens full length insert cDNA 




striatin, calmodulin-binding protein 
gb:z!03g07.r1 Soares_fetal_liver_spleen_
gb:AF074666 Human fetal liver cDNA libra 
potassium voltage-gated channel, shaker- 
PRO0327 protein 
clone FLBT727 
hypothetical protein FLJ11109
Homo sapiens cDNA FLJ12280 fis, clone MA



S100 calcium-binding protein A2 
322988 C18727 Hs.171941

323003 AI733859 Hs.149089 

323013 AA134042 Hs.191451

323025 AL157565 Hs.315369

323032 AW244073 Hs.145946

323052 R21124 Hs.85573

323064 AL119341 Hs.49359

323098 AI700025 Hs.270471 

323102 AL119913 Hs.163615 

323155 AL135041 

323176 AW071548 Hs.82101 

323191 AA195600 Hs.301570

323225 AA205654 Hs.24790 

323232 AA148722 Hs.224680

323266 AW003352 Hs.243886 

323281 AI697556 Hs.292659 

323283 AA256014 Hs.86682 

323314 AA226310 Hs.191501

323316 AL134620 Hs.280175

323334 AI336501 Hs.77273 

323338 R74219 Hs.23348 

323348 AA233056 Hs.191518

323351 AA704103 Hs.24049 

323359 AA234172 Hs.137418 

323360 AA716061 Hs.161719 
323405 AW139550 Hs.115173
323420 AI672386 Hs.263780
323434 AW081455 Hs.120219
323445 AA253103 Hs.135569 
323449 AA282865 Hs.284153 
323492 H00978 Hs.20887 
323501 AA182461 Hs.84520 
323505 AI652287 

323515 AA282274 Hs.256083 

323541 AM85116 Hs.104613 

323545 AI814405 Hs.224569 

323635 R63117 Hs.9691 

323675 AA984759 Hs.272168 

323678 AL042121 Hs.20880 

323691 AA317561 Hs.145599

323693 AW297758 Hs.249721 

323746 AW298611 Hs.12808

323774 AA329806 Hs.321056 

323856 AA355264 Hs.267604 

323857 T18988 Hs.293668 
323870 AA341774 Hs.129212 
323876 AL042492 Hs.147313
323885 AA344308 Hs.128427 
323911 AL043212 Hs.92550 
323919 AA862973 Hs.220704
323972 AI869954 Hs.182906
324006 AA610011 Hs.208021 
324036 AI472078 Hs.303662 
324055 AA528794 Hs.128644 
324063 AW292740 Hs.272813 
324072 AA381829 

324092 AW269931 Hs.202473 

324096 AW377983 Hs.298140 

324129 AI381918 Hs.285833 

324132 AW504860 Hs.288836 

324214 AA412395 Hs.225740 

324227 AA295552 Hs.28631 

324266 AL047634 Hs.231913 

324275 AA429088 Hs.98523 

324281 AL048026 Hs.124675 

324290 AA432032 Hs.304420 

324303 AL118764

324312 AI198841 Hs.128173 

324326 AL138153 Hs.300410 

324338 AL138357 Hs.145078 

324341 AW197734 Hs.99807 

324343 AW452016 Hs.293232 

324371 AA452305 Hs.270319 

324382 AW502749 Hs.24724 

324384 AA453396 Hs.127656

324385 F28212 Hs.284247 
324388 AI924963 Hs.306206 
324432 AA464510 Hs.152812 
324497 AW152624 Hs.136340
324510 AI148353 Hs.287425
324580 AA492588 

324582 AA506935 Hs.132036

324633 AA572994 Hs.325489 

324640 AW295832 Hs.134798 

324675 AW014734 Hs.157969
Homo sapiens cDNA: FLJ23075 fis, clone



nuclear autoantigenic sperm protein (his
ESTs 

Homo sapiens cDNA: FLJ21578 fis, clone C



S-phase kinase-associated protein 2 (p45



gb:EST382593 MAGE resequences, MAGK Homo 2.21



EST
dual oxidase 1 

gb:EST94855 Activated T-cells I Homo sap
Homo sapiens cDNA: FLJ22278 fis, clone H
Homo sapiens cDNA: FLJ22502 fis, clone H
Homo sapiens cDNA: FLJ22135 fis, clone H
hypothetical protein FLJ12673
ESTs 

Homo sapiens cDNA: FLJ22141 fis, clone H



ESTs, Weakly similar to T14742 hypothet 0.14
ESTs 3.71 
gb:DKFZp761P1910.r1 761 (synonym: hamy2) 0.95 



MFH-amplified sequences with leucine-ric

KIAA1349 protein

KIAA1491 protein

hypothetical protein FLJ11215 

ESTs 

ESTs, Weakly similar to unnamed protein
Homo sapiens cDNA FLJ11569 fis, clone HE
gb:ng99c08.s1 NCI_CGAP_Thy1 Homo sapiens

ESTs, Weakly similar to ALU1_HUMAN ALU S
ESTs 

ESTs, Moderately similar to TTL_MOUSE TU
324699 AW504732 Hs.21275

324747 AA603532 Hs.130807 

324748 AA667467 Hs.292385 
324801 AI819924 Hs.14553 
324804 AI692662 

324828 AA843926 Hs.124434 

324855 AW152305 Hs.122364 

324866 AI541214 Hs.46320

324871 AW297755 Hs.271923 

324886 AA806794 Hs.131511 

324889 D31010 

324948 AW383618 Hs.265459 

324953 AI264628 Hs.125428 

324958 AA625076 Hs.132892 

324988 T06997 Hs.121028 

325024 F13254 Hs.78672 

325105 H97109 Hs.105421 

325108 AA401863 Hs.22380 

325114 D83901 Hs.315562 

325146 AI064690 Hs.171176 

325149 D61117 Hs.187646 

325187 AI653682 Hs.197812 



hypothetical protein FLJ11011



325740 
325792 
325819 
325883 



326108 
326163 
326165 
326189 
326204 
326230 
326274 
326360 



326720 
326742 
326770 



327036 
327040 
327053 
327075 
327130
327156 
327220 
327224 
327288 
327321 
327332 
327361 
327377 
327396 
327414 
327442 
327467 
327473 
327483 
327562 
327568 
327606 
327611 
327642 
327654 
327734 
327775 
327796 
327840 
327940 
327984 
328004 



328530 
328600 
328608 
328616 



328664 
328666 
328698 



329134 
329157 
329178 
329192 
329764
329816 
329860 
330088
330093 
330100 
330106 
330107 
330120 



330313 



330372 

330385 AA449749

330397 D14659 

330468 L10343 

330472 L24203 

330478 L38486 

330493 M27326 

330496 M31328 

330506 M51906 

330512 M80563 

330537 U19765

330547 U32989 

330551 U39840 

330568 U56244 

330599 U90437 

330601 U90916 

330605 X02419 

330609 X04741 

330617 X53587 

330630 X78669 

330644 Y07755 

330650 Z68228 

330660 AA347868 

330692 AA017045 

330707 AA133891

330716 AA233707 

330717 AA233926
330722 AA243560 
330740 AA297746 
330742 AA400979
330744 AA406142 
330761 AA428286 
330760 AA448663 
330763 AA450200 
330786 D60374 
330790 T48536 
330814 AA015730 
330827 AA040332 
330844 AA063037
330901 AA157818 
330931 F01443 
330952 H02855 
330961 H10998 
330968 H16568 
331014 H98597 
331046 N66553 
331060 N75081 
331099 R36671 
331108 R41408 
331131 R54797 
331135 R61398 
331170 T23461 
331180 T32446 
331183 T40769 
331203 T82310 
331271 AA059347 
331306 AA252079 
331327 AA281076 
331341 AA303125 
331359 AA416979 
331363 AA421562 
331378 AA448881 
331384 AA456001 
331402 AA505135 
331422 F10802 



karyopherin alpha 5 (importin alpha 6) 

KIAA0103 gene product

protease inhibitor 3, skin-derived (SKAL 



Hs.12744 
Hs.55803 
Hs.267319 
Hs.284256 
Hs.29567 
Hs.7164 
Hs.23748 
Hs.30340 
Hs.191358 
Hs.157148 
Hs.83937 
Hs.21983 

Hs.4197 



microfibrillar-associated protein 4
endogenous retroviral protease 
guanine nucleotide binding protein (G pr 
phosphoinositide-3-kinase, regulatory su 
S100 calcium-binding protein A4 (calcium 
zinc finger protein 9 (a cellular retrov
tryptophan 2,3-dioxygenase 
hepatocyte nuclear factor 3, alpha 
(NONE) 

gb:Human RP1 homolog mRNA, 3'UTR region 
Homo sapiens cDNA: FLJ21930 fis, clone H 
plasminogen activator, urc " - 



integrin, beta 4



Hs.82845 
Hs.77274 
Hs.75118 
Hs.85266 
Hs.79088 
Hs.38991 
Hs.2340 
Hs.139293 
Hs.6702 
Hs.293690 
Hs.11571 
Hs.52620 
Hs.34382 
Hs.22654 
Hs.25691 
Hs.12393 
Hs.29643 
Hs.30469 
Hs.274337 
Hs.49136 



Hs.265398 ESTs, Weakly sin 



S100 calcium-binding protein A2 
junction plakoglobin

ESTs, Weakly similar to ALU7_HUMAN ALU S



Homo sapiens voltage-gated sodium cha 
receptor (calcitonin) activity modifying 
dTDP-D-glucose 4,6-dehydratase 
Homo sapiens cDNA FLJ13103 fis, clone NT



a disintegrin and metalloproteinase doma 
ESTs 

hypothetical protein KIAA1165
ESTs 

Homo sapiens cDNA FLJ11883 fis, clone HE



gb:yg87b07.s1 Soares infant brain 1NIB H 



Hs.6640 Human DNA sequence from PAC 75N13 on chr



ESTs 

Homo sapiens cDNA FLJ13496 fis, clone PL
KIAA1462 protein 

anterior gradient 2 (Xenopus laevis) hom
hypothetical protein FLJ11088
NADPH oxidase 4 
ESTs 

ESTs, Moderately similar to ALU7_HUMAN
331531 N51343

331547 N54811 

331578 N67960 

331589 N71027 

331608 N89861 

331614 N92293 

331668 W69707 

331671 W72033 

331676 W79834 

331681 W85712 



331717 AA190888 Hs.153881

331718 AA191404 Hs.104072
331811 AA404500 Hs.301570
331820 AA405970 Hs.97996
331831 AA412031 Hs.97901
331852 AA418988 Hs.98314
331943 AA453418 Hs.21275
331969 AA460702 Hs.82772
331990 AA478102 Hs.139631
332002 AA482009 Hs.105104
332027 AA489671 Hs.65641
332029 AA489697 Hs.145053
332033 AA489840 Hs.251014 EST
332048 AA496019 Hs.201591
332071 AA598594 Hs.205293
332074 AA599012
332083 AA600200 Hs.155546
332085 AA600353 Hs.173933
332125 AA609861 Hs.312447
332177 F10812 Hs.101433 
332180 H03348 Hs.7327 

332185 H10356 Hs.101689 
332203 H49388 Hs.317769 
Hs.101915 
Hs.324267 
Hs.269137 
Hs.26530 
Hs.146381 
Hs.21201 
Hs.101539 
Hs.101774 
Hs.101850
Hs.289068
Hs.11112
Hs.111758
Hs.250700



CDA14 

gb:yz15g04.s1 Sc 
gb:od74f04.s1 NCI_CGAP_Ov2 Ho
332240 N54803

332261 N70294 

332275 R08838 

332280 R38100 

332299 R69250 

332304 R74041 

332314 T25862 

332384 M11433 

332434 N75542 

332445 T63781 

332453 L00205 

332458 M33493 

332504 AA053917 

332525 M17252 

332530 M31682 



332539 AA412528 Hs.20183



Hs.15 
Hs.278430 



332563 N92924 

332565 AA234896 

332594 AA279313

332634 "" 

332638 AA283034 

332640 AA417152 

332654 AA001296 

332665 AA223335

332692 AA496035 

332716 L00058 

332736 L13773 

332758 X93921 

332781 AA233258 
332792 
332816 



332959 
332982 
332984 



Hs.50640 

Hs.5101 

Hs.288217 

Hs.63788 

Hs.247926 

Hs.79070 

Hs.114765

Hs.296938 

Hs.247112 



ras homolog gene family, member I 
ESTs, Weakly similar to rhotekin [M.musc
collagen, type III, alpha 1 (Ehlers-Danl

nql type 1M! I j t r tefan 
Homo sapiens NY-REN-62 antigen mRNA, par 



transcription termination factor, mitoc
EST

Homo sapiens mRNA; cDNA DKFZp586L0120 (f
hypothetical protein FLJ11011
collagen, type XI, alpha 1



hypothetical protein FLJ20073



KIAA1211 pr< 
gb:ae41e11.s1G£ 

KIAA1080 protein; Golgi-associated, gamm 



serum deprivation response (phosphatidyl 
RNA binding motif protein, X chromosome 
nectin 3; DKFZP566B0846 protein
ESTs

hypothetical protein FLJ23045
retinol binding protein 1, cellular
Homo sapiens cDNA FLJ11918 fis, clone HE
ESTs 
keratin 6A 
tryptase beta 1 

chromosome 14 open reading frame 1 
cytochrome P450, subfamily XXIA (steroid 
inhibin, beta B (activin AB beta polypep
cysteine-rich motor neuron 1
ESTs, Weakly similar to AF164793 1 prote
cytokeratin 2 

protease, serine, 16 (thymus) 

E1A binding protein p300 

methyl CpG binding protein 2 (Rett syndr

JAK binding protein 
protein regulator of cytokinesis 1 
hypothetical protein MGC2941 
propionyl Coenzyme A carboxylase, beta p 
gap junction protein, alpha 5, 40kD (con 
v-myc avian myelocytomatosis viral oncog 
myeloid/lymphoid or mixed-lineage leukem 
dual specificity phosphatase 7 
hypothetical protein FLJ10902
333139 1.88 0.84

333140 0.21 0.64 






334161 
334183 
334187 
334219 
334222 



334648 
334787 
334866 
334891 



336120
335126 
335179 
335188 
335211 

335288
335289 
335361 

335416
335496 
335497 



335619 
335620 
335621
335682
336021
336034 
336033 
336065 



337028 
337043 
337046 
337054 
337128 
337162 
337183 
337184 



337605 
337671 
337755 
337786 
337809 



338110 
338112 
338145 
338148 
338158 
338161 
338179 
338182 
338189 
338197 
338199 
338215
TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.
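The row layout described for Table 8B (a Pkey, then the gene cluster number, then the Genbank accession numbers making up that cluster) can be read mechanically. The following is a minimal sketch, assuming whitespace-separated fields in that order; the names `ClusterRow`, `parse_cluster_row` and `accessions_by_pkey` are hypothetical and are not part of the original disclosure.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One Table 8B style row (field names are illustrative assumptions)."""
    pkey: str               # Eos probeset identifier
    cluster: str            # gene cluster number, e.g. "467396.1"
    accessions: List[str]   # Genbank accessions comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a row into Pkey, cluster number and accession list."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, cluster and accessions: {line!r}")
    pkey, cluster, *accessions = fields
    return ClusterRow(pkey, cluster, accessions)


def accessions_by_pkey(lines: Iterable[str]) -> Dict[str, ClusterRow]:
    """Index rows by Pkey, skipping blank lines."""
    table = {}
    for line in lines:
        if line.strip():
            row = parse_cluster_row(line)
            table[row.pkey] = row
    return table


rows = accessions_by_pkey([
    "331547 467396.1 AA828597 N54811",
    "",
    "302705 31765.1 U09060 U09061",
])
print(rows["331547"].accessions)
```

Keying the result on the Pkey mirrors how the table is used: a probeset without a UnigeneID is looked up by its Pkey to recover the cluster and its member sequences.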






322060 
321430 
321467 
322125 
322166 
322173 
322178 
322179 
321577 
321587 
313723 
320997 
322278 
321687 



322664 
315454 
322687 
314852 
307783 
324072 
300627 
323505 
315791 
324303 



44320.1 
42705.1
43034.1
46779.1 



AW340926 AA249063 N86075

AI341937 AW0C3063 U34725 AA904742

X57414 X57415

X13075 X13076

R93901 AF075G73 R93902

H69434 AF085958 H69846

H52567 H52557 AF085970 H52154

H56535 AF085980 H56712



H84849 H84252 H84260 H86664 H85320

H95531 H95521 H84529

AA070412 AA102346 AA081885

H22544 H46842 AI204929

W69304 AF086283 W69200

AA525149 AA313030 AA313052 H97463

AA665089 AA135130 AA484059 AA102419 AW877765

W79150 AF086419

AI668646 AI734214 W17348

AW979268 AA878419 AA431342 AA431628



46885.1 
1615102.1
1615333.1
111953.1
627492.1 
47271.1 
218439.1 
129439.1 
47422.1 
814584.1 



979809.1 AL120701 AL135041 AL121524

38927.1 AF147359 T58511 T58560

473768.2 W88919 W89125

1574395.1 242308 H23514

82296.1 AA005129 AA679084 AA694399

85042.1 AA011522 AA702841 AA011691 AA330797

380580.1 AI239464 AI239473 AA625812 AI208703

37372.1 AF074566 AI110759 AF090902

327472.1 AI903735 AA491283 AI694953 AW976903 AA761362

697809.1 AI347274 AW844024

269032.1 AA381722 AA381829 AW963906 AW963902 AA381242

221345.1 AA488472 W27363 AA317053 BE082689 AW967036 BE079872

196389.1 AW970512 AA280251 AI652287 BE465438 AI650725 AA551854 AA281574 AW571481

403558.1 AA678177 AA677034

233842.1 AL118754 AA333202 H38001

442885.1 AA847835 AA768376

333127.1 AA504850 AA504911
302697
302711 
302742 

310624 
302847 
304122 
303598 
311409 
312094 
319312 



320018 
319484 
318865 
312220 
319646 
312389 



321270 
314126 
320714 
306442 



WO 02/086443

328264.1 AA492588 AA492498 AA492571

275087.1 T78054 T79888 AA398185

398093.1 AI692552 AI393343 AI800510 AI377711 F24263 AA661876

1515978.1 D31010 D30991 D31168 D31166 D31465

43219.1 AJ001409 AJ001410

45419.1 L08442 D51348

458.39 L12061

364430.1 T25451 AA585296 AA585305

34524.4 U38896 U88898 AA916056 T03285 AI341594 AI359534 AI634031 U88897

458.1 05 X98941 X98942 X98943 X98953 X98949

77271.-5 H28966

------ AA382814 AA402411 AA412355
837264.1 AI698839 AI909260 AI909259

797889.1 Z78390 T97427

1540116.1 Z45481 F12393 T74437

1688823.1 R05329 R01555 R08276

1689571.1 T82930 R02424 T85145

229583.1 AA335314 T82938 AA327744 AW967388 AA639967 T10753

1815987.1 T83263 T85731 T85730

1691553.1 T91772 R07257 R07098

1535937.1 H10818 F07831 Z43072

1671607.1 N74613 T98756 T98589

243305.1 R09692 R09414 AA346353

902057.1 AI863140 W80703 R43474

1566853.1 H14957 R56522 R11908

291472.1 BE080180 AW827313 AW231970 AA995028 AA428584 AW872715 AW892508 AW854593 AA578441 AW975234 AA664937 AA984131
AA528743 AA552874 AA564758 AW063245 AI267534 AW070190 AW893483 AA770330 AA906928 AA906582 AA758746 AA551717
AW063311 AA429538

579192.1 AW206447 AI248530 AI084433 AI400976 R16553

112523.1 AA071267 T65940 T64515 AA071334

80531.1 AA018306 H38925 AA001221

410938.1 H79670 H47798 AA700289

212379.1 N34524 AA305071 AW954303 AA502335 AI433430 AI203597 AW026670 AW265323 AW850787 AA317554 AW993643 AW835572

AW385512 AI334966 W32951 H62656 H53902 R88904 AW835732

28832.-3 AA769156 



AA976899 
AA977348 
AA978186 
AA988646 
AA994530 



306656 
306686 
306751 

306892 
308106 
308154 



308588 

308643 
308673 
308697 
308778 
308808 
308875 
308886 
308898 



AI439473 
AI092465 
AI476803 
AI500600 
AI125111 
AI125152 
AI557041 
AI557135 
AI557246 
AI718299 
AI719893
AI745040
AI760864
AI767143
AI811109 
AI318289 
AI832332 
AI833240 
AI858845 
AI870704 
AI873111 



305016 
305034 
305072 
305148 
305190 
303978 
303990 



305413 
305447 
321244 



AA626876 
AA630128
AA641012 
AA654070 
AA665955 
AW513315 
AW515455 
AW516449 
AW516611 
AA670480 
AA700201 

AA737856
305814 AA782866

305637 AA806124

305539 AA806138

305850 AA807709

305690 AA813477

305728 AA828209

305759 AA835353

305792 AA845256

307041 AI144243

307091 AI167439

307181 AI189251

305901 AA872968

305910 AA875981

307415 AI242118 

307426 AI243364 

307517 AI275055 

307551 AI281556 

307561 AI282207 

307608 AI290295 

307691 AI318285

307730 AI336092

307760 AI342387

307764 AI342731

307796 AI350556

309045 AI910902

309051 AI911975

307807 AI351799

307808 AI351826
307820 AI355751
307852 AI365541
309122 AI928178
309164 AI937761 
309177 AI951118 
307902 AI380462 
309299 AW003478 
309303 AW004823 
309476 AW129363 
309532 AW151119 
309747 AW264889 
309769 AW272346 
309799 AW276964 
309866 AW299916 

302679 311853.1 H65022 AA186889

309923 AW340684 

309928 AW341418 

309931 AW341683 

309933 AW341936 

302705 31765.1 U09060 U09061

302789 34161.1 AJ245067 AJ245070 

304006 AW517947 

304024 T03036 

304026 T03160 

304028 T03266 

304046 T54803 

304061 T61521 

304063 T62535 

302802 34487.1 Y08250 Y08245 

304114 R78946 

304155 H68696 

304203 N56929 

304234 W81608 

304348 AA179868 

304430 AA347682 

304456 AA411240 

304521 AA464716

304626 AA476427 

304607 AA513322 

304735 AA576453 

304760 AA580401 

306015 AA897115

306063 AA906316 

306065 AA906725 

306104 AA910956 

306109 AA911861 

306242 AA932805 

306288 AA936900 

306396 AA970223 

330568 NOT_FOUND_entrez U56244

330599 15323.-12 U90437

331131 genbank_R54797 R54797

331203 NOT_FOUND_entrez T82310

331531 genbank_N51343 N51343

331547 467396.1 AA828597 N54811

332074 genbank_AA599012 AA599012
fibers. For each predicted exon, we have listed the genomic

Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Indicates DNA strand from which exons were predicted.
Indicates nucleotide positions of predicted exons.
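The strand and nucleotide-position columns described above pair a strand label with a start-end range, for example "Plus 2009620-2009738" or "Minus 32975145-32975053". The following is a minimal sketch of reading such an entry, assuming the coordinates are inclusive and that Minus-strand entries list the higher coordinate first, as the rows in this table appear to do. The names `PredictedExon` and `parse_exon` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    strand: str   # "Plus" or "Minus", as printed in the table
    start: int    # first coordinate as printed
    end: int      # second coordinate as printed

    @property
    def length(self) -> int:
        # Assumption: coordinates are inclusive, so a span a-b covers
        # |b - a| + 1 nucleotides regardless of strand.
        return abs(self.end - self.start) + 1


def parse_exon(text: str) -> PredictedExon:
    """Parse an entry such as 'Plus 2009620-2009738'."""
    strand, span = text.split()
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unknown strand label: {strand!r}")
    start_text, end_text = span.split("-")
    return PredictedExon(strand, int(start_text), int(end_text))


exon = parse_exon("Plus 2009620-2009738")
print(exon.strand, exon.length)
```

Taking the absolute difference keeps the length calculation the same for both strands, so the printed order of the coordinates does not need to be normalized first.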



332792 
332816 
332906 



Dunham, I. et.al.



Plus 2009620-2009738 
Plus 2510528-2510658 
Plus 2518145-2518213 






333772 
333777 
333846 
333884 



333968 
334061 
334094 
334113 
334161 
334219 
334239 
334333 
334378 
334382 



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



12716160-12716384 
13056569-13056693 
13603544-13603657 
13907239-13907370 
13915866-13916036 
14987847-14987940 
15032740-15032817 
15176123-15176470 
15333206-15333305 
18872214-18872317 
19299770-19299944 
20103970-20104058 



21436286-21436384 
21441390-21441471 
21634405-21634526 
21669118-21669328 
21774611-21774680 
22807292-22807445 



24740167-24740347
336034
336038 
336107 
336632 
336633 
336634 







Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



29014404-29014590
Dunham, I. et.al.



337162 
337183 
337184 
337268 
337299 
337389 
337493 
337549 



338008 
338033 
338110 
338112 
338145 



338418 
338501 
338506 



332982 
332984 
332998 
333058 
333097 
333121 
333122 
333123 
333140 



334183 
334187 
334222 
334223 
334255 



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.



Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.
Dunham, I. et.al.




34474472-34474531 
3971764-3971900 
4449069-4449193 
5443027-5443101 



10384481-10384621 
10391398-10391600 
11386629-11386692 
11448985-11449085
12808775-12808833 
13638107-13638181 



17089711-17089988 
17132477-17132547 
18062184-18062402 
18074402-18074501 



21244713-21244828 
21221871-21221953 
21509763-21509864 
24404720-24404899 
27236005-27236108 
27792166-27792272 
28410653-28410734 
29160655-29160725 
30077787-30078184 



31141580-31141765 
31456454-31456519 
31583467-31583536 
32216399-32215527 



2628296-2628109 
2632606-2632457 
2711704-2711565 



8217374-8217261 

8218238-8218063 

11832582-11832508 

11921456-11921205 

12732417-12732289 

12734365-12734269 

13200776-13200692 

14478333-14478172 



20078117-20077991
PCT/US02/12476



20173311-20173218 
20341159-20341087 
21297367-21297214 



25421215-25421093 
25763806-25763747 
26320043-26319845 



26711437-26711300 
26977639-26977558 
27360474-27360400 



338671 
338676 
338726 
338779 
338871 



17407330-17407251 

17610892-17610821 

22215251-22215034 

24591853-24591771 

24610510-24610359 

26716579-26716481 

30015948-30015800 

33371317-33371258 

33376212-33376158 

1299296-1299194 

1346555-1346397 



4133203-4133081 

5347658-5347550 

9318438-9318301 

11794465-11794343 

12124716-12124658 



12878594-12878478 
13760865-13760780 
14055447-14055355 



22049171-22049081 
22311966-22311856 
24508421-24508346 
24637427-24637369 



27030151-27029795 
28301708-28301611 
28300921-28300790 
29614876-29614749 



Minus 32975145-32975053 

2630-2694 

162154-162264 



50921-51050 

118590-119172 

133794-133981 

79927-80217 

126724-126967 

73476-73574 

1065020-1065089
6682468 Plus



205138-205269 
207533-207690 
1018-1176



6448539 Plus 



5867124 Plus 



5867208 Minus 

5867212 Plus 

5867218 Minus 

5867230 Minus 

4567182 Plus 

6042048 Plus 

5867293 Plus 

5867320 Plus 

5867341 Plus 

5867435 Minus 

5867439 Plus 

6138928 Plus 

6015249 Minus 

6015249 Minus 

6015253 Plus 

6015278 Plus 

6015293 Plus 

6015302 Minus 

6671864 Minus 



326720 5552456 Plus 



5857657 Minus 



6531976 Plus 

5866841 Minus 

5867481 Plus 

5867516 Minus 

5867525 Minus 

5867534 Plus 

6249562 Minus 



5867759 
5867772 
5867775 

5867804 
5867811 
6004463 
5867868 
5867891 
5867910 
5867940 



7369-7441 



101911-102081 

105841-106035 

101307-101434 

172397-172491 

7831-8035 

410289-410404 

70854-70915 



661381-661510 



100091-100282 

99443-99778 

21166-21301 

1043-1199 

37517-37638 

59613-59770 

127553-127656 

35311-35406 



16023-16581 

18147-18339 

10217-10357 

75340-75455 

783670-783817 

2247267-2247437 

4041318-4041431 

4734947-4735069 

319951-320040 

20247-22343 

2462-2620 

48583-48773 

56361-56532 



61013-62130 

8702-8820 

102461-102586 

111483-111618 

88030-88151 

75101-75181 

181573-181662 

37610-37676 

343989-344474 

46152-46287 

200262-200495
327775 5867964 Minus

327796 5867982 Plus 

327840 6249578 Minus 

330208 6013599 Plus 

330263 6671884 Minus 

328004 5867993 Minus 

328101 5868020 Plus 

328100 5868020 Minus 

328113 5868024 Minus 

328157 5868064 Plus 

328196 5868080 Minus 

328197 5868081 Minus 
327940 5868197 Minus 
327984 5868216 Plus 
328021 5902482 Plus 
328068 6117819 Plus 
328264 6381912 Plus 
330300 2905862 Minus 
328608 5868222 Minus 
328600 5868229 Minus 
328616 5868239 Plus 
328623 5868246 Minus 
328632 5868247 Plus 
328666 5868254 Minus 
328698 5868264 Minus 
328700 5868264 Plus 
328708 5868271 Minus 
328735 5868289 Plus 
328743 5868289 Plus 
328806 5868324 Plus 
328299 5868366 Minus 
328342 5868383 Plus
328365 5868387 Minus 
328369 5868388 Plus 
328381 5868392 Plus 
328451 5868425 Minus 
328481 5868449 Minus 
328500 5868464 Plus 
328530 5868482 Plus 
328664 6004473 Plus 
328861 6381928 Minus 
328908 5868493 Plus 
328933 5868500 Plus
328934 5868500 Plus 
328949 6456765 Minus 
330313 6042030 Minus 
329005 5868542 Plus 
330366 2944106 Plus 
330372 6580495 Minus 
329033 5868561 Minus 
329037 5868562 Minus 
329067 5868591 Minus 
329134 5868679 Plus 
329157 5868687 Minus 
329178 5868704 Plus 
329192 5868716 Plus 
329194 5868716 Minus 
329204 5868720 Minus
329224 5868728 Plus 
329228 5868728 Minus 
329288 5868771 Plus 
329337 5868806 Minus 
329011 6682532 Plus 



130791-130871 

85267-85405 

73065-73206 

66517-66931 

101503-101634 

157407-157887 

289920-290014 

263545-263635 

80378-80491 

73326-73615 




87770-87953 

38889-40010 

293920-294224 

120020-120126 

76734-76853 

778-901 

625555-625633 

764089-764203 

68114-68854 

89389-89455 

274638-274726 

29408-29684 

149708-149889 

59955-60094 

270724-270798 

75371-75583 

662758-662848 

217275-217336 

8987-9180 

59098-59481 

334973-335406 

1193739-1193866 

108317-108403 

117002-117059 

771755-771889 

846342-846448 

43552-43619 

33642-33775 

85470-85673 

151837-151914 

317461-317688 

5390-5479 

32466-32562 

146417-147652 

29959-30018 

145940-146155 

179177-179463 

166936-167020 

304450-304559 

3050-3190 

27422-27664 

50118-50287 

25554-26299
TABLE 9A: Potential Therapeutic, Diagnostic and Prognostic targets for Therapy of Lung Cancer
Table 9B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 9A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
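The R1 and R2 columns defined above are ratios of group averages. The following is a minimal sketch of that arithmetic for a single probeset, assuming each group is given as a list of expression values and assuming R2 uses the same normal-lung denominator as R1 (the source text for R2 is cut off after "divided by the"). The function name `expression_ratios` and the sample values are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def expression_ratios(
    tumor: Sequence[float],
    disease: Sequence[float],
    normal: Sequence[float],
) -> Tuple[float, float]:
    """Return (R1, R2) for one probeset.

    R1 = average of lung tumor samples / average of normal lung samples
    R2 = average of non-malignant lung disease samples / average of normal
         lung samples (denominator assumed, see lead-in)
    """
    normal_avg = mean(normal)
    if normal_avg == 0:
        raise ValueError("average of normal lung samples is zero")
    return mean(tumor) / normal_avg, mean(disease) / normal_avg


# Illustrative values only; these are not taken from the tables.
r1, r2 = expression_ratios(
    tumor=[4.0, 6.0],
    disease=[3.0, 1.0],
    normal=[2.0, 2.0],
)
print(r1, r2)
```

A probeset with a high R1 and an R2 near 1 would be elevated in tumors but not in non-malignant lung disease, which is the kind of contrast the two columns are set up to show.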



400195 
400205 
400220 
400277 
400285 

400288 X06256 

400289 X07820 
400298 AA032279 
400301 X03635 
400303 AA242758 
400328 X87344 
400419 AF084545 
400512 



400749 
400763 
401027 
401093 



401747 
401760 
401780 
401781 
401785 
401797 
401961 

401985 AF053004 
401994 



UnigeneID Unigene Title

NM_007057*:Homo sapiens ZW10 interactor 
NM_006265*:Homo sapiens RAD21 (S. pombe) 
Eos Control 
Eos Control 
Eos Control 

Hs.149609 integrin, alpha 5 (fibronectin receptor,
Hs.2258 matrix metalloproteinase 10 (stromelysin
Hs.61635 six transmembrane epithelial antigen of
Hs.1657 estrogen receptor 1
Hs.79136 LIV-1 protein, estrogen regulated
Hs.180062 transporter 2, ATP-binding cassette, sub
Target 

NM_030878*:Homo sapiens cytochrome P450, 

NM_030878*:Homo sapiens cytochrome P450,
NM_002425:Homo sapiens matrix metallopro
NM_002425:Homo sapiens matrix metallopro
NM_002425:Homo sapiens matrix metallopro
NM_003105*:Homo sapiens sortilin-related
Target Exon
Target Exon

C12000586*:gi|6330167|dbj|BAA86477.1| (A
Target Exon

C12000457*:gi|7512178|pir||T30337 polyp
ENSP00000247172*:HYPOTHETICAL 126.2 kDa
C14000397*:gi|7499898|pir||T33295 hypoth
histone deacetylase 5

ENSP00000241802*:CDNA FLJ11007 FIS, CLON
Homo sapiens keratin 17 (KRT17)
Target Exon

NM_005557*:Homo sapiens keratin 16 (foca
Target Exon

NM_002275*:Homo sapiens keratin 15 (KRT1
Target Exon

NM_021626:Homo sapiens serine carboxypep
class I cytokine receptor 
Target Exon 



R1 



R2 



402297 
402408 
402420 
402674 
402802 
402994 
403137 

403306 NM_006825



403478 
403485 
403627 
403715 



NM_001436*:Homo sapiens fibrillarin (FBL
Target Exon
Target Exon



C1000823*:gi|10432400|emb|CAC10290.1| (A
Target Exon

NM_001397:Homo sapiens endothelin conver
NM_002463*:Homo sapiens myxovirus (influ
NM_005381*:Homo sapiens nucleolin (NCL),
transmembrane protein (63kD), endoplasmi
Target Exon

ENSP00000231844*:Ecotropic virus integra
NM_022342:Homo sapiens kinesin protein 9
C3001813*:gi|12737279|ref|XP_012163.1| k
Target Exon
Target Exon

ENSP00000237855*:DJ398G3.2 (NOVEL PROTEI
NM_016020*:Homo sapiens CGI-75 protein (
C8000950:gi|423560|pir||A47313 RNA-bindi
NM_005510:Homo sapiens ret finger protei



Target Exon 
NM_005936:Homo sapiens myeloid/lymphoid 
NM_021058*:Homo sapiens H2B histone fami
404267
404298 






406360 
406399 
406467 

406621 X57809 

406642 AJ246210 

406663 U24683 

406671 AA129547 

406673 M34996 

406676 X58399 

406678 U77534 

406686 M18728 

406687 M31126 
406690 M29540 
406698 X03068 
406816 AA833930 
406851 AA609784 
406964 M21305 
406967 M24349 
406974 M57293 
407103 AA424881
407128 R83312
407137 T97307
407168 R45175
407239 AA076350
407242 M18728
407244 M10014
407289 AA135159
407300 AA102616
407366 AF026942
407378 AA299264
407430 AF169361
407463 AJ132087
407677 AW131324
407634 AW016569
407710 AW022727
407720 AB037776
407746 AK001962
407766 AA116021
407768 D50915
407782 AA608956
407788 BE514982
407790 AI027274
407811 AW190902
407839 AA045144
407944 R34008
408000 L11690
408031 AA081395
408063 BE086548
408070 AW148852
408101 AW968504
408122 AI432652
408212 AA297567
408243 Y00787
408349 BE546947

408353 BE439838

408354 AI382803
408369 R38438
408380 AF123050
408482 NM_000676
408522 AI541214 
408636 AW381532 
408545 AW235405 
408572 AA055611 
408633 AW963372 
408660 AA525775 
408761 AA057264 
408771 AW732573 



C6001909:gi|704441|dbj|BAA18909.1| (D298
C6001238*:gi|121715|sp|P26697|GTA3_CHICK
Target Exon

NM_021048:Homo sapiens melanoma antigen,
NM_005596*:Homo sapiens nuclear factor I
cholesteryl ester transfer protein, plas
Target Exon

NM_005365:Homo sapiens melanoma antigen,
Target Exon

CY000047*:gi|11427234|ref|XP_009399.1| z
NM_031413*:Homo sapiens cat eye syndrome
Target Exon

C12000200:gi|4557225|ref|NP_000005.1| al
cytochrome c-1

NM_002362:Homo sapiens melanoma antigen,
C15000305:gi|3805122|gb|AAC59198.1| (AF0
NM_000179*:Homo sapiens mutS (E. coli) h
Target Exon

NM_003122*:Homo sapiens serine protease
Target Exon
Hs.181125 immunoglobulin lambda locus
Hs.293441
Hs.285754 
Hs.193253
Hs.81221 



immunoglobulin heavy constant n 
met proto-oncogene (hepatocyte
major histocompatibility complex, 
Human I ~ ~ ' 
gb:Hum 



Hs.272822 
Hs.220529 
Hs.73931 



gb:Human nonspecific crossreacting antig
pregnancy specific beta-1-glycoprotein 9
carcinoembryonic antigen-related cell ad
major histocompatibility complex, class 
tRNA isopentenylpyrophosphate transferas 
major histocompatibility complex, class 
gb:Human alpha satellite and satellite 3 
gb:Human parathyroid hormone-like protei 
gb:Human parathyroid hormone-related pep 
hypothetical protein MGC13170 
EST 

gb:ye53h05.s1 Soares fetal liver spleen



Hs.288941 

Hs.40098 

Hs.161566 

Hs.239727 

Hs.620 

Hs.42173 

Hs.42346 

Hs.123073 

Hs.42824

Hs.43728 

Hs.624 

Hs.44276 

Hs.44298 

Hs.159235 

Hs.182576 

Hs.44532 

Hs.45743 

Hs.46320 

Hs.135188 



Hs.46677 

Hs.238936 
Hs.47584 



fibrinogen, gamma polypeptide 
Homo sapiens cDNA FLJ12149 fis, clone MA 
gb:zn43e07.s1 Stratagene HeLa cell s3 93
gb:Homo sapiens cig33 mRNA, partial sequ
ESTs, Moderately similar to I38022 hypot
gb:Homo sapiens protein tyrosine phospha 
gb:Honx>sap:er= mRNAfo: a-crcma 1 c> rc:ii 
hypothetical protein MGC12538 
UDP-OcNtebctaGa- aota •,3-H.;.tYtv'v„; 
ESTs 

KIAA1355 protein

hypothetical protein FLJ11100

ubiquitin specific protease 18

KIAA0125 gene product

ES c. Mode w:> ;ntcrtol-UK';ri tUI 

S100 calcium-binding protein A2 

Homo sapiens cDNA FLJ14TO fs :'o,r- PI 



bullous pemphigoid antigen 1 (230/240kD)
Homo sapiens cDNA FLJ10356 fis, clone NT
calcineurin-binding protein calsarcin-1
gb:xf05d05.x1 NCI_CGAP_Bm36 Homo sapien
CDC2-related protein kinase 7
hypothetical protein FLJ10718
hypothetical protein 
Interleukin 8 
homeo box C10 

mitochondrial ribosomal protein S17 
ESTs 

solute carrier family 15 (H777 transport



ESTs, Moderately similar to ALU4_HUMAN A
PRO2000 protein 

ESTs, Moderately similar to PC4259 ferri 
ESTs, Weakly similar to (defline not ava 
potassium voltage-gated channel, delayed

Hs.344096
BE220053
AI769160
Homo sapiens mRNA; cDNA DKFZp586P2321 (f
NM_001898
gb:UI-HF-BR0p-ajr-f-11-0-UI.r1 NIH MGC 5
Hs.278025
gb:RC3-BT0319-120200-014-a09 BT0319 Homo
KIAA0918 protein
Hs.58169
highly expressed in cancer, rich in leuc
Homo sapiens cDNA FLJ14035 fis, clone HE
hypothetical protein FLJ12691
AJ132592
AW182833
410407 X66839
AW016824
cell division cycle 2-like 1 (PITSLRE pr
gb:QV3-BT0379-010300-105-g03 BT0379 Homo

411800 N39342 Hs.103042 microtubule-associated protein 1B 23.34 34.00
411945 AL033527 Hs.92137 v-myc avian myelocytomatosis viral oncog 1.00 8.00
412115 AK001763 Hs.73239 hypothetical protein FLJ10901 2.07 1.64
412140 AA219691 Hs.73625 RAB6 interacting, kinesin-like (rabkines 118.48 92.00
412276 BE262621 Hs.73798 macrophage migration inhibitory factor ( 1.98 1.49
412464 T78141 Hs.22826 ESTs, Weakly similar to I55214 salivary 1.16 1.34
412530 AA766268 Hs.256273 hypothetical protein FLJ13346 41.52 84.00
412537 AU31778 nuclear transcription factor Y, alpha 17.90 55.00
412811 H06382

412817 AL037159 Hs.74619 

412863 AA121673 Hs.59757

412924 BE018422 Hs.75258 

413004 T35901 Hs.75117 

413011 AW068115 Hs.821 

413048 M93221 Hs.75182 

413063 AL035737 Hs.75184 

413129 AF292100 Hs.104613 

413142 M81740 Hs.75212 

413223 AI732182 Hs.191866 

413248 T64858 Hs.21433 

413273 U75679 Hs.75257 

413278 BE563086 Hs.833 

413281 AA861271 Hs.222024 

413364 BE536218 Hs.137516 

413385 M34455 Hs.840 

413409 AI633418 Hs.1440 

413453 AA129640 Hs.128065 

413527 BE250788 Hs.179882

413554 AA319146 Hs.75426

413573 AI733859 Hs.149089

413582 AW295547 Hs.71331

413597 AW302385 Hs.117183

413690 BE157489 

413691 AB023173 Hs.75478 
413719 BE439580 Hs.75498 
413753 U17760 Hs.75517 
413801 M62246 Hs.35406 
413833 Z15005 Hs.75573 
413882 AA132973 Hs.184492 
413926 AA133338 Hs.54310 
413943 AW294416 Hs.144687
413995 BE048146 Hs.75671
414035 Y00630 Hs.75716
414142 AW368397 Hs.334485
414180 AI863304 Hs.120905
414245 BE148072 Hs.75850
414275 AW970254 Hs.889 
414317 BE263280 Hs.75888 
414334 AA824298 Hs.21331 
414341 D80004 Hs.75909 
414368 W70171 Hs.75939 
414416 AW409985 Hs.76084 
414430 AI346201 Hs.76118 
414570 Y00285 Hs.76473 
414618 AI204600 Hs.96978 
414675 R79015 Hs.296281 
414683 S78296 Hs.76888 



olfactomedin related ER localized protei 
ESTs 

hypothetical protein AF301222
ESTs

proteasome (prosome, macropain) 26S subu

zinc finger protein 281

H2A histone family, member Y

interleukin enhancer binding factor 2, 4 

biglycan 

mannose receptor, C type 1 
chitinase 3-like 1 (cartilage glycoprote 
RP42 homolog
hypothetical protein DKFZp547J036
stem-loop (histone) binding protein 
interferon-stimulated protein, 15 kDa 
transcription factor BMAL2 



hypothetical protein MGC5350 
ESTs 

gb:RC1-HT0375-120200-011-e06 HT0375 Homo

ATPase, Class VI, type 11B

small inducible cytokine subfamily A (Cy 

laminin, beta 3 (nicein (125kD), kalinin 

ESTs, Highly similar to unnamed protein 

centromere protein E (312kD)

ESTs 

ESTs 

Homo sapiens cDNA FLJ12981 fis, clone NT
syntaxin 1A (brain)

serine (or cysteine) proteinase inhibito
Homo sapiens cDNA FLJ14438 fis, clone HE
Homo sapiens cDNA FLJ11448 fis, clone HE
WAS protein family, member 1
Charcot-Leyden crystal protein
phosphogluconate dehydrogenase
hypothetical protein FLJ10036
KIAA0182 protein



insulin-like growth factor 2 receptor 
hypothetical protein MGC10764
interleukin enhancer binding factor 1 



Hs.76 

Hs.288735 Homo sa 



414711 

414718 H95348 Hs.107987

414732 AW410976 Hs.77152 

414747 U30872 Hs.77204 

414761 AU077228 Hs.77256 

414774 X02419 Hs.77274 

414806 D14694 Hs.77329 

414809 AI434699 Hs.77356 

414812 X72755 Hs.77387 

414825 X06370 Hs.77432 

414839 X63692 Hs.77462 

414883 AA926960 

414907 X90725 Hs.77597 

414914 U49844 Hs.77613 

414945 BE076358 Hs.77667 

414972 BE263782 Hs.77595 

415014 AW954064 Hs.24951

415091 AL044872 Hs.77910

415138 C18356 Hs.295944

415227 AW821113 Hs.72402

415238 R37780 Hs.21422

415263 AA948033 Hs.130853

415295 R41450 Hs.6546

415339 NM_015156 Hs.78398

415669 NM_005025 Hs.78589

415674 BE394784 Hs.78596

415709 AA649850 Hs.278558

415735 AA704162 Hs.120811

415799 AA353718 Hs.225841

415817 U88967 Hs.78867

415857 AA866115 Hs.127797

415989 AI267700



minichromosome maintenance deficient (S. 
centromere protein F (350/400kD, mitosin
enhancer of zeste (Drosophila) homolog 2



or receptor (avian
DNA (cytosine-5-)-methyltransferase 1
CDC28 protein kinase 1
polo (Drosophila)-like kinase
ataxia telangiectasia and Rad3 related 
lymphocyte antigen 6 complex, locus E 
KIAA0008 gene product 



proteasome (prosome, macropain) subunit, 
ESTs 

ESTs, Weakly similar to I38022 hypothet
DKFZP434D193 protein
protein tyrosine phosphatase, receptor-t
Homo sapiens cDNA FLJ11381 fis, clone HE
416018 AW138239

416065 BE267931 

416111 AA033813 

416177 AA174069 

416178 AI808527

416208 AW291168

416209 AA236776
416239 AL038450
416250 AA581386

416322 BE019494

416423 H54375

416448 L13210

416498 U33632

416658 U03272

416661 AA634543

416722 AA354604

416819 U77735

416936 N21352

417034 NM_006183

417061 AI675944

417079 U65590

417218 AA129547

417233 W25005

417308 H60720

417315 AI030042

417324 AW265494 

417366 BE185289 



Hs.78977
Hs.73996 
Hs.79018 
Hs.137607 



proprotein convertase subtilisin/kexin t 
proliferating cell nuclear antigen 
chromatin assembly factor 1, subunit A (
ESTs

serologically defined breast cancer anti

ESTs, Weakly similar to MUC2_HUMAN MUCIN

MAD2 (mitotic arrest deficient, yeast, h
417428 N87579

417433 BE270266

417466 AI681547 

417512 AI979168 

417515 L24203 Hs.32237 

417542 J04129 Hs.82269 

417576 AA339449 Hs.82285 

417715 AW969587 Hs.36366 ESTs 



Hs.79351 
Hs.79432 
Hs.79440 
Hs.122546 
Hs.80205 
Hs.42987 
Hs.30962 
Hs.188691 
Hs.81134 
Hs.235754 
Hs.24395 
Hs.81892 
Hs.130450 

Hs.1076 
Hs.82045 
Hs.278871 
Hs.32128 
Hs.59457 
Hs.344096 



417791 AW965339 

417830 AW504786 

417866 AW067903 

417900 BE250127 

417933 X02308 

417944 AU077196 

417975 AA641836 

417991 AA731452 

418004 U37519 

418007 M13509 

418054 NM_002318 

418057 NM_012151 

418113 AI272141 



418322 AA284166 

418327 U70370 

418345 AJ001696 

418379 AA218940 



Hs.83484 
Hs.83551 
Hs.83758 
Hs.34772 



418462 I 
418478 I 
418506 i 



Hs.84746 
Hs.84790 
Hs.85266 
Hs.1174 
Hs.35339 
Hs.85838 
Hs.85951 
Hs.85962 



418641 BE243136 

418661 NM_001949 

418663 AK001100 

418678 NM_001327 

418686 Z36830 

418689 AI360883 

418712 Z42183 

418727 AA227609 

418738 AW388633 

418819 AA228776 

418830 BE513731 

418882 NM_004996 



potassium channel, subfamily K, member 1 
fibrillin 2 (congenital contractural ara 
IGF-II mRNA-binding protein 3 
hypothetical protein FLJ23017 
pim-2 oncogene 

ESTs, Weakly similar to S21348 probable 



Homo sapiens cDNA FLJ12033 fis, clone HE 

interleukin 1 receptor antagonist 

met proto-oncogene (hepatocyte growth fa 

small inducible cytokine subfamily B (Cy 

KIAA0101 gene product 

ribosomal protein S24 

ESTs 

small proline-rich protein 1B (cornifin) 
midkine (neurite growth-promoting factor 
gb:LL2030F Human fetal heart, Lambda ZAP 
5T4 oncofetal trophoblast glycoprotein 
hypothetical protein FLJ22127 
glycoprotein (transmembrane) nmb 
ataxia-telangiectasia group D-associated 



Hs.122579 
Hs.82772 
Hs.82906 
Hs.82962 
Hs.82985 
Hs.30085 
Hs.190008 
Hs.87539 



ESTs 

hypothetical protein FLJ10461 

collagen, type XI, alpha 1 

CDC20 (cell division cycle 20, S. cerevi 

thymidylate synthetase 

collagen, type V, alpha 2 

FLJ23186 



aldehyde dehydrogenase 3 family, member 
matrix metalloproteinase 1 (interstitial 
lysyl oxidase-like 2 

coagulation factor VIII-associated (intr 
SRY (sex determining region Y)-box 4 
microfibrillar-associated protein 2 
CDC28 protein kinase 2 
ESTs 

AF15q14 protein 
ESTs 

KIAA1323 protein 

oviductal glycoprotein 1, 120kD (mucin 9 
cathepsin K (pycnodysostosis) 
Homo sapiens cDNA: FLJ21578 fis, clone C 
cyclin-dependent kinase inhibitor 3 (CDK 
paired-like homeodomain transcription fa 
serine (or cysteine) proteinase inhibito 
fidgetin-like 1 

chromosome condensation 1 
KIAA0225 protein 
integrin, beta 4 

cyclin-dependent kinase inhibitor 2A (me 
G protein-coupled receptor 39 
solute carrier family 16 (monocarboxylic 
exportin, tRNA (nuclear export receptor 
hyaluronan synthase 3 
M-phase phosphoprotein 9 
" ■■ '' ' '"ngroupA 



Hs.41690 
Hs.87225 
Hs.87268 
Hs.274448 

Hs.94834 
Hs.6682 
Hs.191721 



cancer/testis antigen 



solute carrier family 7, (cationic an 
AA360392 


Hs.87113 






AA233056 










Hs.89584 


r n su i ino rri a-as s oci ate d 1 




AW014836 


Hs, 18844 






AW150835 




hypothetical protein FLJ21620 




AI538323 


Hs.52620 


integrin, beta 8 




















NM_002846 


Hs.89655 


protein tyrosine phosphatase, receptor t 






Hs.89663 


cytochrome P450, subfamily XXIV (vitamin 




AU076718 








AA256106 








AW960146 


Hs.284137 


hypothetical protein FLJ12808 














Hs.90073 








Hs.90315 


KIAA0007 protein 








gb:HUM316G10B Clontech human aorta polyA 








PTK7 protein tyrosine kinase 7 


419474 


AW968619 


Hs.155849 


ESTs 


419485 


AA489023 


Hs.99807 


ESTs, Weakly similar to unnamed protein 




AA316241 


Hs.90691 


nucleophosmin/nucleoplasmin 3 




AU076704 




fibrinogen, A alpha polypeptide 




AF070590 


Hs.90869 


Homo sapiens clones 24622 and 24523 mRNA 






Hs.91093 


chitinase 1 (chitotriosidase) 








jagged 1 (Alagille syndrome) 




AA013051 


Hs.128151 


topoisomerase (DNA) II binding protein 




NMJX11650 


Hs.288650 














NM_007019 


Hs.93002 


ubiquitin carrier protein E2-C 




AF042001 




slug (chicken homolog), zinc finger prot 




AA249573 


Hs.152618 


ESTs, Moderately similar to ZN91_HUMAN Z 






Hs.93304 


phospholipase A2, group VII (platelet-ac 




AI792788 




gb:ol91d05.y5 NCI_CGAP_Kid5 Homo sapiens 




AB040959 


Hs.93836 


DKFZP434N014 protein 






Hs.94030 


Homo sapiens mRNA; cDNA DKFZp566E1624 (f 


420005 


AW271105 


Hs.133294 










brefeldin A-inhibited guanine nucleotide 








Homo sapiens cDNA FLJ10551 fis, clone NT 




BE378432 




cyclin-dependent kinase 4 




AW374968 


Hs.348112 


Human DNA sequence from clone RP5-1103G7 




AF004884 




calcium channel, voltage-dependent, P/Q 






Hs.323494 






AW043637 




ESTs, Weakly similar to ALU5_HUMAN ALU S 




NM_001756 




serine (or cysteine) proteinase inhibito 




AA640891 








AF050147 


Hs.97932 


chondromodulin I precursor 




AK001978 


Hs.98510 


similar to rab11-binding protein 




AK000492 


Hs.98806 


hypothetical protein 




AW207748 


Hs.59115 










distal-less homeo box 5 






Hs.88678 






AA927802 














420783 


AI659838 


Hs.99923 


lectin, galactoside-binding, soluble, 7 




AL045633 


Hs.44269 






AF044197 


Hs.100431 


small inducible cytokine B subfamily (Cy 


421002 


AF116030 


Hs.100932 


transcription factor 17 




AA761198 








AI684808 


Hs.197653 










ESTs, Moderately similar to I38022 hypot 




NM_004689 


Hs.101448 














AA401369 










Hs.189902 






H87879 


Hs.102267 


lysyl oxidase 




BE539976 


Hs.103305 


Homo sapiens mRNA; cDNA DKFZp434B0425 (F 




AA287203 


Hs.324728 








Hs.103982 


small inducible cytokine subfamily B (Cy 




AA291377 


Hs.50331 










solute carrier family 1 (glutamate trans 




BE302796 


Hs.105097 


thymidine kinase 1, soluble 




NM_004833 


Hs.105115 


absent in melanoma 2 












AA312082 




GDNF family receptor alpha 1 




AL080121 




DKFZP564O0823 protein 




AF026692 


Hs.105700 




421574 


AJ000152 


Hs.105924 


defensin, beta 2 


421582 


AI910275 




trefoil factor 1 (breast cancer, estroge 


421633 


AF121860 


Hs.106260 


sorting nexin 10 


421659 


NM_014459 


Hs.106511 


protocadherin 17 




H64092 


Hs.38282 


ESTs 


421753 


BE314828 


Hs.107911 


ATP-binding cassette, sub-family B (MDR/ 


421773 


W69233 


Hs.112457 


ESTs 




BE562088 


Hs.108196 


HSPC037 protein 
424046 AF027866 

424086 AI351010 

424098 AF077374 

424120 T80579 

424165 AW582904 
AA337221 

424279 L29306 

424308 AW975531 



Hs.108660 
Hs.45107 
Hs.109643 
Hs.1440 
Hs.334309 



422134 AW179019 

422158 L10343 

422168 AA586894 

422278 AF072873 

422282 AF019225 

422283 AW411307 

422310 AA316622 

422311 AF073515 
422330 D30783 
422364 AF067800 
422406 AF025441 
422424 AI186431 Hs.296638 
422440 NM_004812 Hs.116724 
422487 AJ010901 Hs.193267 
422511 AU076442 Hs.117938 
422515 AW500470 Hs.117950 
422656 AI870435 Hs.1569 
422737 M26939 Hs.119571 
422756 AA441787 Hs.119689 
422765 AW409701 Hs.1578 
422809 AK001379 Hs.121028 
422867 L32137 Hs.1584 
422938 NM_001809 Hs.1594 
422956 BE545072 Hs.122579 
422960 AW890487 Hs.63984 
422963 AA401369 Hs.190721 
422976 AU076657 Hs.1600 
422981 AF026445 Hs.122752 
422986 AA319777 Hs.221974 
423034 AL119930 

423049 X59373 Hs.188023 



423184 NM_004428 

423217 NM_000094 

423248 AA380177 

423309 BE006775 

423361 AW170055 

423453 AW450737 

423511 AF036329 

423516 AB007933 

423551 AA327598 

423554 M90516 

423575 C18863 

423624 AI807408 

423634 AW959908 

423642 AW452650 

423662 AA642452 

423673 BE003054 

423698 AA329796 

423725 AJ403108 

423761 NM_005194 

423787 AJ295745 

423816 AF151064 

423826 U20325 

423849 AL157425 

423887 AL080207 
Hs.1624 
Hs.129715 
Hs.129729 
Hs.233785 
Hs.1674 
Hs.163443 



hypothetical protein FLJ22704 
gastrin releasing peptide 
gb:QV0-OT0033-010400-182-a07 OT0033 Homo 
serine (or cysteine) proteinase inhibito 
mitochondrial ribosomal protein L42 
protease inhibitor 3, skin-derived (SKAL 
S100 calcium-binding protein A7 (psorias 
frizzled (Drosophila) homolog 6 
apolipoprotein L 

CDC45 (cell division cycle 45, S.cerevis 
cytochrome P450, subfamily IIS, polypept 
cytokine receptor-like factor 1 
epiregulin 

C-type (calcium dependent, carbohydrate- 
Opa-interacting protein 5 
prostate differentiation factor 
aldo-keto reductase family 1, member B10 
mucin 4, tracheobronchial 
collagen, type XVII, alpha 1 
multifunctional polypeptide similar to S 
LIM homeobox protein 2 
collagen, type III, alpha 1(1" 
glycoprotein hormones, alp 
baculoviral IAP repeat-containing 5 (sur 
hypothetical protein FLJ10549 
cartilage oligomeric matrix protein (pse 
centromere protein A (17kD) 
ECT2 protein (Epithelial cell transformi 
cadherin 13, H-cadherin (heart) 



ESTs 

gb:DKFZp761A092_r1 761 (synonym: hamy2) 
ESTs, Moderately similar to HXDA_HUMAN H 
sperm associated antigen 4 
ephrin-A1 

collagen, type VII, alpha 1 (epidermolys 
ribulose-5-phosphate-3-epimerase 



CGI-09 protein 

gonadotropin-releasing hormone 2 
ligand of neuronal nitric oxide synthase 
heparin-binding growth factor binding pr 

hypothetical protein MGC13204 

B-cell CLL/lymphoma 11A (zinc finger pro 
Hs.153692 
Hs.102267 
Hs.139322 
DKFZp434J1813 protein 
hypothetical protein LOC57822 
paired box gene 9 
nuclear pore complex protein 
hypothetical protein 
4.014479 Hs.145296 
Homo sapiens mRNA; cDNA DKFZp761J1324 (f 
DKFZP434G232 protein 
Forkhead box E1 (thyroid transcription f 
KIAA1632 protein 

osteoblast specific factor 2 (fasciclin 
tumor protein 63 kDa with strong homolog 
hypothetical protein MGC15730 
Homo sapiens cDNA FLJ14354 fis, clone Y7 
serine (or cysteine) proteinase inhibito 
lysyl oxidase 

small proline-rich protein 3 
ESTs 

islet amyloid polypeptide 
gb:EST41944 Endometrial tumor Homo sapie 
tryptophan hydroxylase (tryptophan 5-mon 
minichromosome maintenance deficient (S. 
disintegrin protease 
Activin A receptor, type I (ACVR1) (ALK 
gb:EST365190 MAGE resequences, MAGB Homo 
ESTs, Weakly similar to I38022 hypotheti 
hypothetical protein NUF2R 
parathyroid hormone receptor 2 
Homo sapiens mRNA; cDNA DKFZp566A1046 (f 
UDP glycosyltransferase 1 family, polype 








































































solute carrier family 12 (potassium/chlo 












S-adenosylhomocysteine hydrolase 
BE296216 


Hs.172673 


1.51 


1.25 


426897 


AA401369 


Hs.190721 


ESTs 


141.56 


17.00 


426925 


NM_001196 


Hs.315689 


Homo sapiens cDNA: FLJ22373 fis, clone H 


32.51 


38.00 


426935 


NM_000088 


Hs.172928 


collagen, type I, alpha 1 


2.65 


3.15 


426964 


AA393739 


Hs.287416 


Homo sapiens cDNA FLJ11439 fis, clone HE 


1.97 


3.49 


426966 


AI493134 




sclerostin 


1.00 




426991 


AK001536 




Homo sapiens cDNA FLJ10674 fis, clone NT 




2.28 


427099 


AB032953 


Hs.173560 


odd Oz/ten-m homolog 2 (Drosophila, mous 


4.24 
428093 AW594506 

428098 AU077258 

428129 AI244311 

428169 AI928984 

428182 BE386042 

428227 AA321649 

428242 H55709 

428330 L22524 

428434 AI909935 

428450 NM_014791 



428479 Y00272 

428484 AF104032 

428505 AL035461 

428532 AF157326 

428645 AA431400 

428664 AK001666 

428698 AA852773 

428728 NM_016625 Hs.191331 

428748 AW593206 Hs.98785 

428758 AA433988 Hs.98502 

428771 AB028992 Hs.193143 

428801 AW277121 Hs.254881 

428810 AF068236 Hs.193788 

428839 AI767756 Hs.82302 

428845 AL157579 Hs.153610 

428959 AF100779 Hs.194680 

428969 AF120274 Hs.194689 

429038 AL023513 Hs.194766 

429065 AI753247 Hs.29643 

429164 AI688663 Hs.116586 

429170 NM_001394 Hs.2359 
429220 AW207206 

429228 AI553633 Hs.326447 
Hs.177936 
Hs.178078 
Hs.2256 
429263 AA019004 Hs.198396 

429276 AF056085 Hs.198612 

429359 W00482 Hs.2399 

429412 NM_006235 Hs.2407 

429413 NM_014058 Hs.201877 
429486 AF155827 



429538 BE182592 

429547 AA401369 

429551 AW450624 

429563 BE619413 

429597 NM_003816 Hs.2442 



429610 AB024937 

429612 AF062649 

429616 AI982722 

429656 X05608 



lectin, superfamily member 1 (cartilage- 
SPANX family, member C 
glutamate receptor, metabotropic 4 
26S proteasome-associated pad1 homolog 
small nuclear RNA activating complex, po 
minichromosome maintenance deficient (S. 
hypothetical protein FLJ23188 
ESTs 

collagen, type X, alpha 1 (Schmid metaph 
Homo sapiens cDNA: FLJ23228 fis, clone C 
calmodulin-like skin protein 
hypothetical protein FLJ14904 
FGFR1 oncogene partner 



solute carrier family 25 (mitochondrial 
ESTs 

hypothetical protein FLJ20116 

serine/threonine kinase 12 

tumor necrosis factor (ligand) superfami 

ESTs 

glutamate-cysteine ligase, catalytic sub 
Homo sapiens cDNA: FLJ23602 fis, clone L 
ESTs, Moderately similar to I38022 hypot 



leukemia inhibitory factor (cholinergic 



Hs.2281 
Hs.184786 
Hs.98729 
Hs.189095 
Hs.334838 



TGP-interacting protein 
ESTs, Weakly similar to 2017205A dihydro 
similar to SALL1 (sal (Drosophila)-like 
KIAA1866 protein 
hypothetical protein 
Ksp37 protein 

hypothetical protein FLJ14303 

KIAA1069 protein 

ESTs 

nitric oxide synthase 2A (inducible, hep 
Homo sapiens cDNA FLJ14814 fis, clone NT 
KIAA0751 gene product 
WNT1 inducible signaling pathway protein 



KIAA0704 protein 
group-specific component (vitamin D bind 
gap junction protein, beta 5 (connexin 3 



ESTs, Highly similar to S60712 band-6-pr 
ATP-binding cassette, sub-family A (ABC1 
G protein-coupled receptor 51 



Hs.204238 
Hs.11251 
Hs.190721 
Hs.220931 
Hs.2437 



POU domain, class 2, associating factor 
DESC1 protein 
hypothetical protein FLJ10339 
lipocalin 2 (oncogene 24p3) 
small proline-rich protein 2A 
ESTs 
ESTs 



eukaryotic translation initiation factor 
a disintegrin and metalloproteinase doma 
LUNX protein; PLUNC (palate lung and nas 
pituitary tumor-transforming 1 
ESTs 

neurofilament, light polypeptide (68kD) 
429903 AL134197 Hs.93597 

429918 AW873986 Hs.119383 ESTs 

429978 AA249027 



phospholipase A2, group IVA (cytosolic, 
tumor necrosis factor receptor superfami 
Ras-GTPase-activating protein SH3-domain 
cyclin-dependent kinase 5, regulatory su 
430044 AA464510 

430114 AA847744 

430134 BE380149 

430147 R60704 

430287 AW182459 

430294 AI638226 

430300 U60805 

430315 NM_004293 

430337 M36707 

430378 Z29572 



430439 AL133561 

430451 AA836472 

430454 AW469011 

430466 AF052573 

430481 AA479678 

430486 BE062109 

430508 AI015435 

430533 AA480895 

430563 AF146074 

430677 236317 

430678 AA401369 
430686 NM_001942 



430935 AW072916 

430985 AA490232 

431009 BE149762 

431089 BE041395 

431092 AI332764 

431124 AF284221 

431164 AA493650 

431211 M86849 

431221 AW207837 

431277 AA501806 



Hs.152812 

Hs.99640 

Hs.105223 

Hs.234434 

Hs.125759 

Hs.32976 

Hs.238648 

Hs.239147 

Hs.239600 

Hs.2556 

Hs.240770 

Hs.241305 

Hs.297939 
Hs.105635 
Hs.241517 
Hs.203269 
Hs.241551 
Hs.104637 
Hs.57749 
Hs.108660 



ESTs 
ESTs 

ESTs, Weakly similar to T33138 hypotheti 
hairy/enhancer-of-split related with YRP 
ESTs, Weakly similar to LEU5_HUMAN LEUKE 
guanine nucleotide binding protein 4 
oncostatin M receptor 
guanine deaminase 
calmodulin-like 3 

tumor necrosis factor receptor superfami 
nuclear cap binding protein subunit 2, 2 
estrogen-responsive B box protein 
DKFZP434B061 protein 

ESTs 



ESTs, Weakly similar to T17288 hypotheti 
ATP-binding cassette, sub-family C (CFTR 
desmoglein 2 



Hs.2633 desmoglein 1 

Hs.7179 ESTs, Weakly similar to 2004399A chromos 
Hs.2699 glypican 1 

zinc finger protein 131 (clone pHZ-10) 
Hs.27323 ESTs, Weakly similar to I78885 serine/th 
Hs.48956 gap junction protein, beta 6 (connexin 3 
ESTs, Weakly similar to unknown protein 
432375 BE536069 
432407 AA221036 
432543 AA552690 
432552 AI537170 
432583 AW023624 
432606 NM_002104 
432625 AI243596 
432653 N62096 
432677 NM_004482 
432715 AA247152 
432753 NM_014075 
432788 AA521091 
432842 AW674093 
432867 AW016936 
432917 NM_014125 



Hs.125757 
Hs.59506 
Hs.94367 
Hs.323733 
Hs.286145 
Hs.345824 

Hs.21659 
Hs.285026 
Hs.256311 
Hs.298312 



doublesex and mab-3 related transcriptio 
Homo sapiens cDNA: FU23494 fis, clone L 
gap junction protein, beta 2, 26kD (conn 
SRB7 (suppressor of RNA polymerase B, ye 



gb:EST382704 MAGE resequences, MAGK Homo 
ESTs 

gb:MR2-HT0377-150200-202-e03 HT0377 Homo 
granin-like neuroendocrine peptide precu 
hypothetical protein DKFZP434A1315 



Hs.271387 

Hs.271580 

Hs.271986 

Hs.272214 

Hs.2877 

Hs.272320 

Hs.236223 

Hs.298241 

Hs.273330 

Hs.273558 

Hs.2936 

Hs.285753 

Hs.274263 

Hs.274419 

Hs.301885 



Hs.163484 
Hs.207530 
Hs.152423 
Hs.173725 
Hs.162282 
Hs.3066 
Hs.94830 
Hs.293185 
Hs.278611 
Hs.200483 
Hs.336938 
Hs.178499 
Hs.334822 
Hs.233364 
Hs.241517 



integrin, alpha 2 (CD49B, alpha 2 subuni 
STG protein 

cadherin 3, type 1, P-cadherin (placenta 
Homo sapiens mRNA; cDNA DKFZp434L1226 (f 
EST 

Transmembrane protease, serine 3 

Homo sapiens, clone IMAGE:3544662, mRNA, 

phosphate cytidylyltransferase 1, cholin 



SCG10-II 



hypothetical protein FLJ10377 
hypothetical protein FLJ10244 
Homo sapiens cDNA FLJ11346 fis, clone PL 
S100 calcium-binding protein P 
gb:zr03f12.r1 Stratagene NT2 neuronal pr 
ESTs 
ESTs 

Homo sapiens cDNA: FLJ21274 fis, clone C 
ESTs, Weakly similar to ALU8_HUMAN ALU S 
potassium channel TASK-4; potassium chan 
granzyme K (serine protease, granzyme 3; 
ESTs, Moderately similar to T03094 A-kin 
ESTs, Weakly similar to JC7328 amino aci 
UDP-N-acetyl-alpha-D-galactosamine:polyp 
ESTs, Weakly similar to KIAA1074 protein 
Homo sapiens PRO0593 mRNA, complete cds 
Homo sapiens cDNA: FLJ23117 fis, clone L 
hypothetical protein MGC4485 
ESTs 

PRO0327 protein 
U37689 
polymerase (RNA) II (DNA directed) polyp 


1.44 


1.30 




AF217513 


Hs.279905 


clone HQ0310PRO0310p1 
85.64 


433023 


AW864793 


Hs.37409 


thrombospondin 1 


20.96 


100.00 




AW193534 


Hs.281895 


Homo sapiens cDNA FLJ11660 fis, clone HE 


1.00 


10.00 


433091 


Y12642 


Hs.3185 


lymphocyte antigen 6 complex, locus D 


1.20 


1.09 


433159 


AB035898 


Hs.150587 


kinesin-like protein 2 


13.82 


39.00 


433183 


AF231338 


Hs.222024 


transcription factor BMAL2 


1.00 


69.00 


433258 


AA622788 


Hs.203613 


ESTs, Weakly similar to ALUB_HUMAN !!!! 


1.00 


1.25 


433409 


AI278302 


Hs.25661 


ESTs 


44.81 


117.00 


433437 


U20536 


Hs.3280 


caspase 6, apoptosis-related cysteine pr 




105.00 


433485 


AI493076 


Hs.201967 


aldo-keto reductase family 1, member C2 


11.55 


2.00 


433537 


AI733592 


Hs.112488 


ESTs 


3.66 


55.00 


433547 


W04978 


Hs.303023 


beta tubulin 1, class VI 


25.16 


83.00 


433556 


W56321 


Hs.111460 


calcium/calmodulin-dependent protein kin 


1.00 


19.00 


433647 


AA603367 


Hs.222294 


ESTs 


20.30 


49.00 


433658 


L03678 


Hs.156110 




5.92 


10.03 


433800 


AI094221 


Hs.135150 


lung type-I cell membrane-associated gly 


2.29 


2.22 


433819 


AW511097 


Hs.112765 


ESTs 


3.71 


3.00 


433862 


D86960 


Hs.3610 


KIAA0205 gene product 


62.08 


104.00 




AA137152 


Hs.286049 


phosphoserine aminotransferase 


108.91 


47.00 




AF116677 


Hs.249270 


/potheticali >- c )1 


1.00 


1.00 
87.00 


434094 


AA305599 


Hs.238205 


hypothetical protein PRO2013 


121.27 


434105 


AW952124 


Hs.13094 


presenilins associated rhomboid-like pro 


1.22 


1.23 


434217 


AW014795 


Hs.23349 


ESTs 


14.11 


57.00 


434340 


AI193043 


Hs.123685 


ESTs, Weakly similar to T17226 hypotheti 


2.10 


2.56 


434360 


AA401369 


Hs.190721 


ESTs 


40.98 


17.00 


434414 


AI798376 




gb:tr34b07.x1 NCI_CGAP_Ov23 Homo sapiens 


1.48 


1.56 


434424 


AI811202 


Hs.325335 


Homo sapiens cDNA: FLJ23523 fis, clone L 


1.00 


54.00 




BE552368 


Hs.231853 


Homo sapiens cDNA FLJ13445 fis, clone PL 


54.91 


85.00 


434551 


BE387162 


Hs.280858 


ESTs, Highly similar to A35661 DNA excis 
2.00 


434627 


AI221894 


Hs.39311 


ESTs 


1.00 


1.00 


434699 


AA643687 


Hs.149425 


Homo sapiens cDNA FLJ11980 fis, clone HE 


1.00 


23.00 


434769 


AA648884 


Hs.134278 


Homo sapiens cDNA FLJ12676 fis, clone NT 


7.05 


56.00 


434792 


AA649253 


Hs.132458 


ESTs 


3.52 


44.00 




AF155108 


Hs.256150 


Homo sapiens, Similar to RIKEN cDNA 2810 


11.33 


1.00 


434828 


D90070 


Hs.96 


phorbol-12-myristate-13-acetate-induced 


1.00 


1.00 


434876 


AF160477 


Hs.61460 


Ig superfamily receptor LNIR 


1.25 


1.29 




AA814309 




ESTs 


1.00 


5.00 


434928 




Hs.4267 


Homo sapiens clones 24714 and 24715 mRNA 


1.00 


1.00 


435013 


H91923 


Hs.110024 


Target CAT 


1.26 




435066 


BE261750 


Hs.4747 


dyskeratosis congenita 1, dyskerin 


1.69 


1.37 


435087 


AW975241 


Hs.23567 


ESTs 




1.00 


435099 


AC004770 


Hs.4756 


flap structure-specific endonuclease 1 


2.90 




435159 


AA668879 


Hs.116649 


ESTs 


1.00 


1.00 


435205 


X54136 


Hs.181125 


immunoglobulin lambda locus 


1.02 


1.46 


435232 


NM_001262 


Hs.4854 


cyclin-dependent kinase inhibitor 2C (p1 


2.04 


2.70 


435304 


H10709 


Hs.269524 


ESTs 


27.58 


139.00 


435313 


AI769400 


Hs.189729 


ESTs 


1.00 


14.00 


435505 


AF200492 


Hs.211238 


interleukin-1 homolog 1 


1.00 


33.00 


435509 


AI458679 


Hs.181915 


ESTs 


1.00 


1.00 


435525 


AI831297 


Hs.123310 


ESTs 


1.00 


56.00 


435532 


AW291488 


Hs.117305 


Homo sapiens, clone IMAGE:3682908, mRNA 


1.00 


2.00 




AI224456 


Hs.324507 


H. sapiens polyA site DNA 


3.42 


3.92 






Hs.283532 


uncharacterized bone marrow protein BM03 


3.95 


1.80 


435766 


R11673 


Hs.186498 


ESTs 


1.00 


23.00 




AB037734 


Hs.4993 


KIAA1313 protein 


23.68 


42.00 


436069 


AI056879 


Hs.253209 


ESTs 


1.00 


58.00 


436170 


AW450381 


Hs.14529 


ESTs 


1.00 


13.00 


436211 




Hs.334828 


hypothetical protein FLJ10719; KIAA1794 


5.84 


22.00 


436213 


AA325512 


Hs.71472 


hypothetical protein FLJ10774; KIAA1709 
fibrinogen-like 1 


1.42 




436217 


T53925 


Hs.107 


57.97 


31.00 


436238 


AK002163 


Hs.301724 


hypothetical protein FLJ11301 


2.51 
nucleolar protein (KKE/D repeat) 


2.33 


1.64 


436291 


BE558452 


Hs.344037 


protein regulator of cytokinesis 1 


108.99 


52.00 




AL355841 


Hs.99330 


hypothetical protein FLJ23588 


0.75 


2.81 




AW992292 


Hs.152213 


wingless-type MMTV integration site fami 


30.01 


1.00 




BE264633 


Hs.143638 


WD repeat domain 4 


2.50 


2.19 




AI948626 


Hs.171356 


ESTs 
1.33 


436443 


AW138211 


Hs.128746 


ESTs 
AJ270693 


Hs.199887 


ESTs 


100 


1.00 




AA379597 


Hs.5199 


HSPC150 protein similar to ubiquitin-con 


3.28 


1.56 




AA742221 


Hs.120633 


ESTs 


1.00 


19.00 


436511 
H 291 


ESTs 


13.76 


14.00 


436553 


X57809 


Hs.181125 


immunoglobulin lambda locus 
436557 


W15573 


Hs.5027 


ESTs, Weakly similar to A47582 B-cell gr 


19.20 


9.75 


436608 


AA528980 




down syndrome critical region protein DS 


33.92 


25.00 


436667 


AW025183 


Hs.127680 




0.89 


1.19 


436771 


AW975687 


Hs.292979 


ESTs 


1.00 


10.00 


436839 


AA401369 


Hs.190721 


ESTs 


1.00 




436887 


AW953157 


Hs.193235 


hypothetical protein DKFZp547D155 


1.06 




436944 


AW268614 


Hs.5840 


ESTs 


1.00 


1.00 


436961 


AW375974 


Hs.156704 


ESTs 


25.13 


25.00 


436972 


AA284579 


Hs.25640 




1.59 




437016 


AU076916 


Hs.5398 




2.35 


178 


437044 


AL035864 


Hs.69517 


cDNA for differentially expressed CO16 g 


1.34 
Hs.125343 


ESTs, Weakly similar to KIAA0758 protein 


1.00 


17.00 




AL110216 


Hs.22826 


ESTs, Weakly similar to I55214 salivary 


40.55 


82.00 






Hs.279243 


Homo sapiens mRNA; cDNA DKFZp564D2071 (f 


1.00 


112.00 










1.00 


205.00 






Hs.323769 


cisplatin resistance related protein CRR 


1.56 


1.54 








Homo sapiens mRNA; cDNA DKFZp5660134 (fr 


113.25 


125.00 




AL359567 


Hs.161962 


Homo sapiens mRNA; cDNA DKFZp547D023 (fr 


1.82 


4.57 




AI125859 


Hs.112607 


ESTs 


1.35 


1.75 




BE069288 


Hs.34744 


Homo sapiens mRNA; cDNA DKFZp547C136 (fr 




3.20 




AI308152 


Hs.27027 


v/pothetical pi in CI h f 2H1311 


3.03 


1.08 




H46008 


Hs.31518 


ESTs 


1.00 


39.00 


437568 


AI954795 


Hs.155135 


ESTs 


1.00 


19.00 




D63880 




chromosome condensation-related SMC-asso 


1.95 


1.57 




AI581344 


Hs.127812 


ESTs, Weakly similar to T17330 hypotheti 


1.00 


3.00 




AI088192 




ESTs, Weakly similar to DDX9_HUMAN ATP-D 


1.00 


45.00 


437840 


AA884836 


Hs.292014 


ESTs 


1.07 






BE001836 


Hs.256897 


ESTs, Weakly similar to dJ365012.1 [H.sa 




326 




BE262082 




hypothetical protein FLJ10305 


1.87 


2.52 






Hs.202312 


Homo sapiens clone N11 NTera2D1 teratoca 


74.05 


35.00 


437916 


BE566249 


Hs.20999 


hypothetical protein FLJ23142 


23.15 


89.00 






Hs.121655 


ESTs 


1.00 


1.00 




AI888256 


Hs.307526 




12.28 


31.00 




AW373062 




nuclear receptor subfamily 1, group I, m 


1.53 


10.85 








ESTs 


1.80 


2.39 


438119 


AW963217 


Hs.203961 


ESTs, Moderately similar to AF116721_89 


22.67 


36.90 




AI918906 


Hs.55080 


ESTs 


1.00 


1.00 




AW970529 


Hs.86434 


hypothetical protein FLJ21816 


38.92 


38.00 


438403 


AA806607 


Hs.292206 


ESTs 


1.00 


1.00 




AA908678 


Hs.130183 


ESTs 


2.05 


80.00 




AW297204 






1.00 


131.00 




AJ245820 




type I transmembrane receptor (seizure-r 


1.43 


1.45 




AI879064 


Hs.54518 


ESTs 


1.00 


34.00 




AW612553 


Hs.114670 


Human DNA sequence from clone RP11-16L21 


1.33 


1.10 




AI885815 


Hs.184727 


Human melanoma-associated antigen p97 (m 


2.42 


1.59 




NM_003787 




nucleolar protein 4 


1.00 


18.00 




AA826425 






2.03 


2.57 






Hs.184987 


ESTs 


6.42 


88.00 










22.41 


17.00 






Hs.285681 


Williams-Beuren syndrome chromosome regi 


1.00 


1.00 






Hs.135056 


Human DNA sequence from clone RP5-850E9 


2.20 


1.88 




AW979121 




gb:EST391231 MAGE resequences, MAGP Homo 


278 


4.81 




AA745978 


Hs.28273 


1.17 












1.00 


28.00 


439128 


AI949371 


Hs.153089 


ESTs 


1.00 


67.00 




AW138909 


Hs.156110 


immunoglobulin kappa constant 


1.38 






AW238299 


Hs.250618 


UL16 binding protein 2 


1.93 


1.64 




AL133916 




hypothetical protein FLJ20093 


45.53 


139.00 




AW837046 


Hs.8527 


G protein-coupled receptor 56 


2.00 


2.20 


439343 


AF086161 


Hs.114611 


hypothetical protein FLJ11808 


6.10 


7.37 


439394 


AA401369 


Hs.190721 


ESTs 


3.39 


17.00 


439410 


AA532012 


Hs.188746 




1.83 


3.07 


439451 


AF086270 


Hs.278554 


heterochromatin-like protein 1 


23.28 


52.00 




AA918317 


Hs.57987 


B-cell CLL/lymphoma 11B (zinc finger pro 


13.76 


122.00 




BE264974 


Hs.6566 


thyroid hormone receptor interactor 13 


2.78 


1.58 






Hs.58042 


ESTs, Moderately similar to GFR3_HUMAN G 


1.22 


1.44 




AF086310 


Hs.103159 


ESTs 


7.46 


39.00 






Hs.185029 


ESTs 


1.00 


1.19 






Hs.58399 




1.00 


1.00 






Hs.58561 


G protein-coupled receptor 87 


33.61 


1.00 




AF088076 


Hs.59507 


ESTs, Weakly similar to AC004858_3 U1 sm 


1.00 


1.00 


439702 


AW085525 


Hs.134182 




4.30 


10.00 




AW872527 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH 


36.55 


11.00 




BE246502 




sema domain, immunoglobulin domain (Ig) 


2.36 


1.88 




AL359053 


Hs.57664 


Homo sapiens mRNA full length insert cDN 


2.0? 


6.08 




AL359055 


Hs.87709 


Homo sapiens mRNA full length insert cDN 


1.00 


21.00 








gb:Homo sapiens mRNA full length insert 


7.27 


25.00 




AW449211 


Hs.105445 


GDNF family receptor alpha 1 


1.00 


1.00 


439926 


AW014875 


Hs.137007 


ESTs 


32.58 


71.00 


439963 


AW247529 




platelet-activating factor acetylhydrola 


21.28 


9.55 




AW600291 




hypothetical protein FLJ10430 


58.83 


61.00 


440006 


AK000517 


Hs.6844 


hypothetical protein FLJ20510 


1.83 


4.02 




AW473675 


Hs.125843 


ESTs, Weakly similar to T17227 hypotheti 


1.42 


2.54 


440106 


AA864968 


Hs.127699 


KIAA1603 protein 


1.00 


54.00 




AB033023 


Hs.318127 


hypothetical protein FLJ10201 


24.18 


52.00 




AI805392 


Hs.325335 


Homo sapiens cDNA: FLJ23523 fis, clone L 


3.21 


4.72 




AW450991 


Hs.1 9; 1 1 




33.63 


113.00 




NM_003812 




a disintegrin and metalloproteinase doma 


52.88 


147.00 


440492 


R39127 


Hs.21433 


hypothetical protein DKFZp547J036 


2.35 


3.62 


440527 


AW857117 


Hs.184164 


ESTs, Moderately similar to S65657 alpha 


10.84 


57.00 


440659 


AF134160 


Hs.7327 




3.18 


2.37 


440704 


M69241 


Hs.162 


insulin-like growth factor binding prote 




2.09 


440943 


AW082298 


Hs.148151 


hypothetical protein MGC2408 


202 


1.41 


440994 


AI160011 


Hs.272068 


ESTs 


1.29 


1.14 


441020 


AA401369 


Hs.190721 


ESTs 


142.99 


17.00 


441031 


AI110684 


Hs.7645 


fibrinogen, B beta polypeptide 




99.00 
AA570255 




ESTs, Weakly similar to T23273 hypotheti






Hs.89605 


cholinergic receptor, nicotinic, alpha p 




BE614410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re




BE218239 


Hs.202656 


ESTs 


441390 


AI692560 


Hs.131175


ESTs 




R51064 


Hs.23172 


ESTs 


441525 


AW241867 


Hs.127728 


ESTs 


441553 


AA281219 


Hs.121296 


ESTs 




NM_005010 


Hs.7912 


neuronal cell adhesion molecule 


441533 


AW958544 


Hs.112242


normal mucosa of esophagus specific 1 




AA081846 


Hs.7921 


Homo sapiens mRNA; cDNA DKFZp566E183 (fr






Hs.7957 


adenosine deaminase, RNA-specific 




AA401369 


Hs.190721 


ESTs 


441801 


AW242799 


Hs.86366 


ESTs 




AI553802




ESTs 






Hs.22279 


ESTs 




AI744935 




Fanconi anemia, complementation group G 




AW887434 




CDA11 protein 




AW956698 




neural precursor cell expressed, develop 




AI740832 




Homo sapiens clone 23570 mRNA sequence 




AW452649 


Hs.166314 






AW664964 


Hs.128899


ESTs 




AA977235 


Hs.128830


ESTs, Weakly similar to Z192_HUMAN ZINC


442159 


AW163390


Hs.278554 






AA983842 


Hs.333556


chromosome 2 open reading frame 2 




AI952430


Hs.150614


ESTs, Weakly similar to ALU4_HUMAN ALU S




BE093589 




hypothetical protein FLJ23468 


442530 


AI580830 


Hs.176508 


Homo sapiens cDNA FLJ14712 fis, clone NT


442547 


AA306997 


Hs.217484 


ESTs, Weakly similar to ALU1_HUMAN ALU S


442556 




Hs.8379 


Homo sapiens mRNA; cDNA DKFZp585L2424 (f 




AA447492 


Hs.20183 


ESTs, Weakly similar to AF164793 1 prote




AI015631 


Hs.23210 


ESTs 




R88362 




ESTs, Weakly similar to T23976 hypotheti


442875 


BE623003 


Hs.23625 


H c - pici slone I i 1 00142 mF IA equ 




AW188551


Hs.99519 


hypothetical protein FLJ14007 




M457211 




bromodomain adjacent to zinc finger doma 




AW167087


Hs.131562 


ESTs 




AI188710








AW205878 




Homo sapiens cDNA FLJ13103 fis, clone NT




AI128388




ESTs 


443247 


BE614387 


Hs.333893 


c-Myc target JP01 


443324 


R44013 


Hs.164225 


ESTs 


443383 


AI792453 


Hs.166507 


ESTs 


443400 


R28424 


Hs.250648 


ESTs 




AF098158 


Hs.9329 


chromosome 20 open reading frame 1 


443572 


AA025610 


Hs.9605 


cleavage and polyadenylation specific fa 




AI078022 


Hs.269636 


ESTs. Weakly similar I 1 HUM N ALU S 








fibrinogen, B beta polypeptide 




AL031290 


Hs.9654 


similar to pregnancy-associated plasma p 




AI085377 


Hs.143610 


ESTs 




AI583187 


Hs.9700 






AI144442




syntaxin 6 


443802 


AW504924 


Hs.9805 


KIAA1291 protein 


443859 


NM_013409


Hs.9914 


follistatin 




AA401369


Hs.190721 


ESTs 




W24187 




gb:zb47f09.M Soares_fetal_lung_NbHL19W


443991 


NM_002250


Hs.10082 


potassium intermediate/small conductance 




BE395085 


Hs.10086 


type 1 transmembrane protein Fn14 




AI380792 


Hs.135104 






U04840 


Hs.214 


neuro-oncological ventral antigen 1 


444127 


N63620 


Hs.13281


ESTs 




AW294292 


Hs.256212 








Hs.89605 


cholinergic receptor, nicotinic, alpha p 




BE540274 




forkhead box M1 






Hs.12569






BE387335 


Hs.283713 


ESTs, Weakly similar to S64054 hypotheti 


444461 




Hs.25978 


ESTs, Weakly similar to 2109260A B cell




AB020684 




ESTs" P 




AI151010 








BE538082 




ESTs, Moderately similar to A46010 X-lin








B aggressive lymphoma gene 




AM 8861 3 


Hs.41690


desmocollin 3 




BE019923 


Hs.243122


hypothetical protein FLJ13057 similar to




NM_014400 


Hs.11950


GPI-anchored metastasis-associated prote


444783 


AK001468 


Hs.62180 


3 I Drc i Scraps homol act 




AK001676 




hypothetical protein FLJ10814


445258 


AIS35931 


Hs.147613


ESTs 


445413 


AA151342 


Hs.12677


CGI-147 protein 




AK001058 


Hs.12580


Homo sapiens cDNA FLJ10196 fis, clone HE


445443 


AV653838 


Hs.322971


ESTs 




AA378776 


Hs.288649 


hypothetical protein MGC3077 




AF208855 


Hs.12830


hypothetical protein 


445537 


AJ245671 


Hs.12844


EGF-like-domain, multiple 6 


445580 


AF167572


Hs.12912


skb1 (S. pombe) homolog


445654 


X91247 


Hs.13046 


thioredoxin reductase 1 
445669


AI570830 


Hs.174870 


ESTs 


10.95 


11.45 


446818 


BE045321 


Hs.136017 


ESTs 


1.00 


1.00 


446873 


AA250970 


Hs.251946 


poly(A)-binding protein, cytoplasmic 1-1 


49.42 


54.00 




AI734009 


Hs.127699


KIAA1603 protein 


1.00 


132.00 


445898 


AF070623 


Hs.13423 


Homo sapiens clone 24468 mRNA sequence 


1.00 


1.00 


445903 


AI347487 




class I cytokine receptor


1.00 


36.00 




BE046441 


Hs.333555 


Homo sapiens clone 24859 mRNA sequence 


2.41 


2.88 


445982 


BE410233 


Hs.13501 


pescadillo (zebrafish) homolog 1, contai


1.60 


1.35 


446078 


AI339982 


Hs.156061 


ESTs 


1.00 


42.00 


446102 




Hs.317694 


ESTs 


1.00 




446157 


BE270828 


Hs.131740 


Homo sapiens cDNA: FLJ22552 fis, clone H


1.70 




446269 


AW263155 


Hs.14559 


hypothetical protein FLJ10540


73.01 


43.00 


446292 


AF081497 


Hs.279682 


Rh type C glycoprotein 


1.55 


1.26 


446293 


AI420213 


Hs.149722 


ESTs 


1.00 


2.00 


446423 


AW139655 


Hs.150120 


ESTs 


1.10 


4.19 


446428 


AW082270 


Hs.12496 


ESTs, Weakly similar to ALU4_HUMAN ALU S


0.53 


3.26 


446432 


AI377320 


Hs.150058 


ESTs 


1.00 


5.00 


446528 


AU076640 


Hs.15243 


nucleolar protein 1 (120kD)






446574 


AI310135 


Hs.335933 


ESTs 


3.89 


72.00 


446619 


AU076643 


Hs.313 


secreted phosphoprotein 1 (osteopontin, 


32.03 


20.23 


446636 


AC002563 


Hs.15767 


citron (rho-interacting, serine/threonin 




5.07 


446783 


AW138343


Hs.141867


ESTs 


282 


9.47 




BE091926 




mitotic spindle coiled-coil related prot


110.28 


23.00 


446849 


AU076617 


Hs.16261 


cleavage and polyadenylation specific fa 


3.26 


2.94 


446856 


AI814373 


Hs.164175


ESTs 


6.38 


11.30 


446872 


X97058 


Hs.16362 


pyrimidinergic receptor P2Y, G-protein c


1.98 


2.03 


446880 


AI811807 


Hs.108646


Homo sapiens cDNA FLJ14934 fis, clone PL


54.90


113.00 


446921 


AB012113 


Hs.16530 


small inducible cytokine subfamily A (Cy 


1.67


3.90 




AK001898 


Hs.16740 


hypothetical protein FLJ11036 


2.82


3.12


447022 


AW291223 


Hs.157573


ESTs 


1.00 


170.00 


447033 


AI357412 


Hs.157601 


ESTs 


7.16 


107.00 


447078 


AW885727 




ESTs 


47.24


24.00


447081 


Y13896 


Hs.17287


potassium inwardly-rectifying channel, s 




17.88 


447131 


NM_004585


Hs.17466


retinoic acid receptor responder (tazaro


0.97




447149 


BE299857 


Hs.326 


TAR (HIV) RNA-binding protein 2 


1.24


1.26




AA805202 


Hs.315562 


ESTs 


1.00 


54.00 




AF026941


Hs.17518 


Homo sapiens cig5 mRNA, partial sequence 


1.00 


67.00 


447178 


AW594641 


Hs.192417 


ESTs 


3.42 


50.00 


447250 


AI878909 


Hs.17883 


protein phosphatase 1G (formerly 2C), ma


1.60 


1.52 


447289 


AW247017 


Hs.36978 


melanoma antigen, family A, 3 


1.00 




447342 


AI199268


Hs.19322


Homo sapiens, Similar to RIKEN cDNA 2010 


5833 


1.00 


447343 


AA256641 


Hs.236894 


ESTs, Highly similar to S02392 alpha-2-m


146.62 


51.00 


447350 




Hs.172634 




1.00 


12.00 


447377 


N27687 


Hs.334334 


transcription factor AP-2 alpha (activat


2.55 


63.00 


447415 


AW937335 


Hs.28149 


ESTs, Weakly similar to KF3B_HUMAN KINES


0.91 


1.13 


447425 


AI963747 


Hs.18573


acylphosphatase 1, erythrocyte (common)


1.00 


35.00 




U46258 


Hs.339665 


ESTs 




49.00 


447532 


AK000614 


Hs.18791


hypothetical protein FLJ20607


1.23 


1.63 


447534 


AA401369 


Hs.190721 


ESTs 


1.00 


17.00 


447636 


Y10043 




high-mobility group (nonhistone chromoso 


1.41 




447688 


N87079 


Hs.19236 


Target CAT 


1.00 


39.00 


447733 


AF157482 


Hs.19400


MAD2 (mitotic arrest deficient, yeast, h 


1.17 


1.12 


447769 


AW873704 


Hs.320831 


Homo sapiens cDNA FLJ14597 fis, clone NT


6.47


5.95


447802 


AW593432 


Hs.161455 


ESTs 


0.73 


2.34 


447850 


AB018298 


Hs.19822


SEC24 (S. cerevisiae) related gene famil 


86.45 


116.00 


447924 


AI817226 


Hs.313413 


ESTs, Weakly similar to T23110 hypotheti


1.00 


1.00 


447973 




Hs.20141 


similar to S. cerevisiae SSM4 


3.50 


4.27 


448030 




Hs.325960 


membrane-spanning 4-domains, subfamily A


4.13 


142.00 




AI538613 


Hs.298241 


Transmembrane protease, serine 3 


1.15 


2.24 


448243 


AW369771 


Hs.52620 


integrin, beta 8 


15.84 


1.00 
1.90 


448278 


W07369 




ESTs 


0.97 


448290 


AK002107 


Hs.20843 


Homo sapiens cDNA FLJ11245 fis, clone PL


1.00 


1.00 


448296 


BE622756 


Hs.10949


Homo sapiens cDNA FLJ14162 fis, clone NT


2.42 


2.17 


448357 


BE274396 


Hs.108923 


RAB38, member RAS oncogene family 


1.44 


1.08 


448390 


AL035414 


Hs.21068 


hypothetical protein 


1.00 




443469 


AW504732 


Hs.21275 


hypothetical protein FLJ11011


2.63 


2.49 


448569 


BE382657 


Hs.21486 


signal transducer and activator of trans 




2.53 


448663 


BE614599 


Hs.106823


hypothetical protein MGC14797 


33 


46.00 


448672 


AI955511 


Hs.225106 




1.00 


21.00 


448733 


NM_005629


Hs.187958


solute carrier family 6 (neurotransmitte


1.82 


1.08 


448741 


BE614567 


Hs.19574


hypothetical protein MGC5469 


2.43 


1.92




AI366784 


Hs.48820 


TATA box binding protein (TBP)-associate 


23.53 


20.00 


448775 


AB025237 


Hs.388 


nudix (nucleoside diphosphate linked moi


2.34


1.97


448826 


AI580252 


Hs.293246 


ESTs, Weakly similar to putative p150 [H 


74.07


62.67 


448830 


AL031658 


Hs.22181 


hypothetical protein dJ310O13.3 


1.37 


1.31 


448844 


AI581519 


Hs.177164 




1.00 


31.00 


448988 


Y09763 


Hs.22785 


gamma-aminobutyric acid (GABA) A recepto


1.84 


1.95 


448993 


AI471630 




KIAA0144 gene product 


1.63 


1.49 


449003 


X76342 


Hs.389 


alcohol dehydrogenase 7 (class IV), mu o 






449029 


N28989 


Hs.22891 


solute carrier family 7 (cationic amino


157 


2.25


449040 


AF040704 


Hs.149443


putative tumor suppressor 


0.97 


1.55 


449048 


Z45051 


Hs.22920 


similar to S68401 (cattle) glucose induc


27.13 


90.00 


449053 


AI625777 


Hs.344766 


ESTs 


8.33 


44.00 


449054 


AF148848 


Hs.22934


myoneurin 


73.85 


104.00 


449101 


AA205847 


Hs.23015 


G protein-coupled receptor 


2.58 


27.00 
KIAA1694 protein










Hs.23255 










AJ403107












BE613348 


Hs.211579














gb:tt09b07.x1 NCI_CGAP_GC6 Homo sapiens


17.28


45.00




AW236021 




Homo sapiens, Similar to RIKEN cDNA 5730




















AW205006 


Hs.197042




1.00 


100 




NM_000579




















1.00
AI768938 AI569996 AI452952 AI168582 AI189869 AI086670 AW262550 AW513854 AA862839 AA435840 AA670197 AI024032 AI990659
AI990089 N81095 AA847919 AW960150 AA211075 AA044704 AA367594 AW582587 AW858854 AW818630 AW818281 AW818433 AW582595
AA096002 N83992

AI471630 BE540637 BE265481 AW407710 BE513882 BE546739 AA053597 BE140503 BE218514 AW956702 AI656234 AI636283 AI567265 
AW340858 BE207794 AA053085 R69173 AA292343 AA454908 AA293504 AI659741 AI927478 AA399460 AI760441 AA346416 BE047245 
AA730380 AA394063 AA454833 AI982791 AI567270 A1813332 AI757858 AA427705 D20284 AI221458 BE048537 AI263048 AA346417 
AA911497 BE537702 



AI761324 AW880941 AW880937

AW118072 AI631982 T15734 AA224195 AI701458 W20198 F25326 AA890570 N90552 AW071907 AI671352 AI375892 T03517 R88265
AI124088 AA224388 AI084316 AI354686 T33652 AI140719 AI720211 T03490 AI372637 T15415 AW205836 AA630384 T03515 T33230
AA017131 AA443303 T33623 AI222556 T33511 T33785 AI419606 D55612

W52854 AL117600 BE208116 BE208432 BE206239 BE082291 AW953423 AA351619 BE180648 BE140560 W60080 AA865478 N90291
AW450652 AW449519 AA993634 AI806539 AA351618 AW449522 AI827626 AA904788 AA380381 AA886045 AA774409 BE003229 Z41756
AL133619 AA468118 AA383064 AI476447 T09430 AI673758 AA524895 AI581345 AI300820 AW498812 AA256162 AI559724 AI685732
AA602400 AA905453 AI204595 AW166541 AA157456 AA156269 AA363652 AA431072 AW592707 AI435410 AW272454 AI215594 AA822747
R74039 N35031 AI804128 AW513621 AA868351 AI026826 AI493388 AA614641 W81604 AI557080 AI214351 AA730140 AI125754 AI200813
AI269603 AI565082 AI807095 AI476629 AA505909 AI368449 AI686077 AI582930 AW085038 AA757863 AA730154 AI767072 AA468316
AI734130 AI734138 AA426284 AA433997 AI741241 AW043563 AI732741 AI732734 AA437369 AA425820 AA664048 R74130
BE144666 BE184942 AW238414 BE184946
AW993247 AW861464
AA203682 R11958 

BE550224 AA832519 N45402 AW685857 N29245 BE455409 W07677 AW970089 AI299731 AA482971 BE503548 H18151 W79223 AF086393 
AA461301 W74510 R34182 AI090665 N46003 BE071550 R25075 AW134982 AI240204 AM 38906 AW026179 AI572316 BE466182 AI206395
AI276154 AI273269 AI422817 AI371014 AI421274 AI188525 AA939164 BE549810 AW137865 AI694996 BE503841 AA459718 BE327407
BE467534 BE218421 BE467767 AA989054 BE467053 AI797130 BE327781



e. The 7 digit numbers in this column are Genbank Identifier (GI) numbers,
nee of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
;es DNA strand from which exons were predicted.
;es nucleotide positions of predicted exons.



"Dunham I. et al." refers to the publication entitled "The DNA



400512 
400517 
400560 
400664 
400665 
400666 
400749 
400763 
401027 
401093 
401203 
401212 
401411 
401435 



401760 
401780 
401781 
401785 
401797 
401961 
401985 



8118496 
8118496 
7331445 
8131616 



7249190 
7249190 
7249190 
6730720 



17982-18115,20297-20456 



70407-70554,71060-71160 
22335-23166 

172961-173056,173868-173928 

87839-88028 

144144-144329 



170688-170834 
96484-96681 

118596-118816,119119-119244,119609-119761,120422-120990,130161-130381,130468-130593,131097-131258,131866-131932,132451-132575,133580-134011

83126-83250,85320-85540,94719-95287 

28397-28617,28920-29045,29135-292 8411 9567 ,'-05-29787,30224-30573 



165776-165996,166189-166314,166406-166569,167112-167268,167387-167469,168634-168942

6973-7118 

124054-124209 

61542-61750 

42904-43124,43211-43336,44607-44763,45199-45281,46337-46732

121907-122035,122804-122921,124019-124161,124455-124610,125672-126075 

113755-113910,115653-115765,116808-116940 

21059-21168 



110325-110491 
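The nucleotide-position entries in this table are comma-separated lists of start-end ranges, one range per predicted exon. The following is a minimal sketch, not part of the original disclosure, of how such an entry could be parsed and sanity-checked; the function names `parse_exon_positions` and `total_exon_length` are hypothetical and chosen only for illustration.

```python
def parse_exon_positions(field):
    """Parse an entry such as '17982-18115,20297-20456' into (start, end) pairs.

    Raises ValueError when a range is malformed (for example a missing hyphen)
    or when a start position exceeds its end position.
    """
    exons = []
    for part in field.split(","):
        # Each part must be exactly 'start-end'; anything else fails to unpack.
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start {start} exceeds end {end}")
        exons.append((start, end))
    return exons


def total_exon_length(exons):
    """Sum the inclusive lengths of all exons."""
    return sum(end - start + 1 for start, end in exons)


exons = parse_exon_positions("17982-18115,20297-20456")
length = total_exon_length(exons)
```

A check of this kind also flags transcription damage: an entry whose start exceeds its end, or whose hyphen has been lost, raises `ValueError` instead of being silently accepted.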






WO 02/086443 

402420 9796339 Plus 
402674 8077108 Minus 



PCT/US02/12476 



403137 
403306 
403329 
403381 
403478 



404076 
404101 
404140 



9211494 Minus 

8099945 Plus 

8516120 Plus 

9438267 Minus 



404253 
404287 
404298 



404721 
404794 
404854 
404877 
404927 
404996 
405449 



9926489 Minus 

4572584 Minus 

5006246 Plus 

9357202 Minus 

2326514 Plus 

9944263 Minus 

9838195 Plus 

7528051 Plus 

9856648 Minus 

4826439 Plus 

7143420 Plus 

1519284 Plus 

7342002 Plus 



405572 3800891 Plus 
405646 4914350 Plus 
405676 4557087 Plus 



405770 
405932 
406137 



7767812 
9166422 
9256107 



127100-127251 
96450-96598 
26009-26178 



2888-3001,3198-3532,3655-4117 

23868-24342 

85126-85292 



3848-3967 

125742-125997 

37761-38147 



129171-129327 

169926-170121 

55675-56055 

53134-53281 

73591-73723 



80430-31581 

173763-174294 

101619-101898 

14260-14537 

1095-2107 



37999-38145,38652-38998,39727-39872,40557-40674,42351-42450 

42236-42570

35912-36065 



406467 9795551 Plus 



30487-31058 
7513-7673 
63448-63554 
182212-182958 



TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic Targets for Therapy of Lung Cancer and Non-malignant Lung Disease

Table 7A shows about 307 genes up-regulated in non-malignant lung disease relative to lung tumors and normal body tissues and/or down-regulated in lung tun

normal lung and non-malignant lung disease. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 10A. For each probeset we have listed the gene cluster number from wh
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered b
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster ar



Pkey: Unique Eos

ExAccn: Exemplar Accession number, Genbank
UnigeneID: Unigene number
Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carci… …as, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
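The R1 and R2 columns defined above are simple ratios of group averages. As an illustration only, the sketch below computes both ratios for a single probeset; the function name `expression_ratios` and the intensity values are hypothetical and are not taken from the table.

```python
from statistics import mean


def expression_ratios(tumor, disease, normal):
    """Return (R1, R2) for one probeset.

    R1 = average of lung tumor samples / average of normal lung samples
    R2 = average of non-malignant lung disease samples / average of normal lung samples
    """
    normal_avg = mean(normal)
    if normal_avg == 0:
        raise ValueError("average of normal lung samples is zero")
    return mean(tumor) / normal_avg, mean(disease) / normal_avg


# Hypothetical expression intensities for one probeset (illustrative values only).
r1, r2 = expression_ratios(
    tumor=[10.0, 30.0],
    disease=[90.0, 110.0],
    normal=[40.0, 60.0],
)
```

With these made-up values R1 is below 1 and R2 is above 1, which is the pattern the table heading describes: lower expression in tumors and higher expression in non-malignant lung disease, each relative to normal lung.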



Pkey 




UnigeneID


Unigene Title 


R1 


R2 


404394 






ENSP00000241075:TRRAP PROTEIN.


0.79 


3.10 


404916 






Target Exon 


1.00 


159.00 


405257 






Target Exon 


1.00 


422.00 


407228 


M25079 


Hs.155376 


hemoglobin, beta 


0.47 


2.33 


407568 


AA740964 


Hs.62699 


ESTs 


1.00 


123.00 


408562 


AI436323 


Hs.31141 


Homo sapiens mRNA for KIAA1568 protein, 


1.00


230.00 


409031 


AA376836 


Hs.76728 


ESTs 


1.00 


128.00 


410434 


AF051152 


Hs.63658 


toll-like receptor 2 


39.65 


149.00 


410467 


AF102546


Hs.63931


dachshund (Drosophila) homolog


1.00 


109.00 


410808 


T40326 


Hs.167793 


ESTs 


1.14 




412351 


AL135960 


Hs.73828 


T-cell acute lymphocytic leukemia 1 


0.37 


2.27 


412372 


R85998 


Hs.285243 


hypothetical protein FLJ22029 


1.00 


173.00 


413795 


AL040178 


Hs.142003 


ESTs 


0.10 


11.90 


414154 


AW205314 


Hs.323060 


ESTs 


0.62 


2.09 


414214 


D49958 


Hs.75819 


glycoprotein M6A 


0.03 


4.55 


414998 


NM_002543 


Hs.77729 


oxidised low density lipoprotein (lectin 


0.64 


2.97 


415122 


D50708 


Hs.22245


ESTs 


0.07 


8.97 


415765 


NM_005424


Hs.78824 


tyrosine kinase with immunoglobulin and 


0.67 


1.65 


415775 


H00747 


Hs.29792 


ESTs, Weakly similar to I38022 hypotheti


0.29 


2.64 


415910 


U20350 


Hs.78913


chemokine (C-X3-C) receptor 1 


1.00 


146.00 








AI815601 


Hs.79197 


CD83 antigen (activated B lymphocytes, i


15.32 


416402 


NM_000715
D13168 


Hs.1012 




0.64


417355


Hs.82002


endothelin receptor type B


0.01


417421 


AL138201


Hs.82120 


nuclear receptor subfamily 4, group A, m 


35.30


417511 


AL049176 


Hs.82223 


chordin-like 


1.00


418489 


U76421 


Hs.85302 


adenosine deaminase, RNA-specific, B1 (h 


0.02


418726 


BE241812 


Hs.87860 


protein tyrosine phosphatase, non-recept 


1.00


418741 


H83265 


Hs.8881 


ESTs, Weakly similar to S41044 chromosom


0.44


418883 


BE387036 


Hs.1211 


acid phosphatase 5, tartrate resistant 


0.96 


419086 


NM_000216


Hs.89591 


Kallmann syndrome 1 sequence 


0.62


419150 


T29618 


Hs.89640 


TEK tyrosine kinase, endothelial (venous 


0.03


419235 


AW470411 


Hs.288433 


neurotrimin 


1.48


419407 


AW410377 




hypothetical protein FLJ21276 


37.55


420556 


AA278300 


Hs.124292


Homo sapiens cDNA: FLJ23123 fis, clone L


0.60


420656 


AA279098 


Hs.187636


ESTs 


1.65 


420729 


AW964897 


Hs.290825 


ESTs 


2.99 


421177 


AW070211 


Hs.102415 


Homo sapiens mRNA; cDNA DKFZp586M0121 (f


0.46 


422060 


R20893 


Hs.325823 


ESTs, Moderately similar to ALU5_HUMAN A


1.00 




W79117 


Hs.58559 


ESTs 


0.03


422652 


AW957969 


Hs.118958 


syntaxin 11


0.14 


423099 


NM_002837


Hs.123641 


protein tyrosine phosphatase, receptor t 


0.01 


424433 


H04607 


Hs.9218 


ESTs 


0.75 


424585 


AA464840 


Hs.131987 


ESTs 


1.00


424711


NM_005795


Hs.152175


calcitonin receptor-like


0.43 


424973 


X92521 


Hs.154057




0.37


425023 


AW956889 


Hs.154210 


endothelial differentiation, sphingolipi 


0.14


425664 


AJ006276 


Hs.159003




1.00


425998 


AU076629 


Hs.165950 


fibroblast growth factor receptor 4 


0.68


426657 


NM_015865


Hs.171731


solute carrier family 14 (urea transport


0.03


426753 


T89832 


Hs.170278 


ESTs 


1.00 






Hs.2171 


growth differentiation factor 10 


1.00


427983 


M17706 


Hs.2233 


colony stimulating factor 3 (granulocyte


0.75 




AK002121 




hypothetical protein FLJ11259


0.76 




AA441837 




ESTs 


0.01 


429496 


AA453800 


Hs.192793 


ESTs 


1.00 




NM_004673 


Hs.241519 


angiopoietin-like 1


1.00 




BE178536


Hs.11090 


membrane-spanning 4-domains, subfamily A 


1.00 


431728 


NM_007351


Hs.268107 


multimerin 


1.00 


431848 


AI378857 


Hs.126758 


ESTs, Highly similar to AF175283 1 zinc 


0.34 


432128 


AA127221 




ESTs 


0.00 


432519 


AI221311 


Hs.130704


ESTs, Weakly similar to BCHUIA S-100 pro 


0.01 


433043 






lymphoid nuclear protein (LAF-4) mRNA 


1.00 




AI823593 


Hs.27688


ESTs 


1.00 




AA644669 


Hs.193042 


ESTs 


1.05 




AW972330 


Hs.283022 


triggering receptor expressed on myeloid 


0.83 


436532 


AA721522 




gb:nv54h12.r1 NCI_CGAP_Ew1 Homo sapiens


1.00 


437119 


AI379921


Hs.177043 


ESTs 


1.00 


437140 


AA312799 


Hs.283689 


activator of CREM in testis 


0.67 


437211 


AA382207 


Hs.5509 


ecotropic viral integration site 2B


1.00 


437960 


AI669586 


Hs.222194 


ESTs 


1.00 


438202 


AW169287


Hs.22588 


ESTs 


1.00 


438873 


AI302471 


Hs.124292 


Homo sapiens cDNA: FLJ23123 fis, clone L


0.71 


438875 


AA827640 


Hs.189059 


ESTs 


23.32 


441048 


AA913488 


Hs.192102 


ESTs 


0.77 


441183 


AW292830 


Hs.255609 


ESTs 


3.43 


441499 


AW298235 


Hs.101689 


ESTs 


1.00 


444513 


AL120214 




glutamate receptor, ionotropic, AMPA 1


1.00 


444527 


NM_005408


Hs.11383


small inducible cytokine subfamily A (Cy 


46.47 


444561 


NM_004469 


Hs.11392 


c-fos induced growth factor (vascular en


0.01 


445279 


R41900 


Hs.22245 


ESTs 


0.60 


446017 


N98238 


Hs.55185 


ESTs 


0.18 


446984 


AB020722 


Hs.16714 


Rho guanine exchange factor (GEF) 15 


0.10 


446998 


N99013 


Hs.16762 


Homo sapiens mRNA; cDNA DKFZp564B2062 (f 






AI375922 


Hs.159367


ESTs 


0.46 


448106 


AI800470 


Hs.171941 


ESTs 


18.05 


448253 


H25899 


Hs.201591 


ESTs 


1.00 


449275 


AW450848 


Hs.205457 


periaxin 


0.56 




AI694722 


Hs.279744 


ESTs 


0.88 




AI654223 


Hs.16026


hypothetical protein FLJ23191


0.52 


450726 


AW204600 


Hs.250505 


retinoic acid receptor, alpha 


0.79 




H83294 


Hs.284122 


Wnt inhibitory factor-1 


0.35 


451533 


NM_004657


Hs.26530 


serum deprivation response (phosphatidyl 


0.13 


453636 


Hs.169872 


ESTs 


1.00 


458332 


AI000341 


Hs.220491 


ESTs 


1.00 




AA022888 


Hs.176065


ESTs 


0.20 


400269 






Eos Control 


0.40 








NM_016369*:Homo sapiens claudin 18 (CLDN


0.53 


407570 


Z19002 


Hs.37096 


nc fir iii upp 


0.01 


412295 


AW088826 


Hs.117176


poly(A)-binding protein, nuclear 1 




414517 


M24461 


Hs.76305 


surfactant, pulmonary-associated protein 


0.64


417204 


N81037 


Hs.1074 


surfactant, pulmonary-associated protein


0.33 


418307 


U70867 


Hs.83974 


solute carrier family 21 (prostaglandin


0.53 




T28499 


Hs.89485 


carbonic anhydrase IV 




421502 


AF111856


Hs.105039


solute carrier family 34 (sodium phospha


0.78 


421798 


N74880 


Hs.29877 


N-acylsphingosine amidohydrolase (acid c


0.59






133.00 
122.57 
142.00 
147.00 








AB011130 


Hs.127436


calcium channel, voltage-dependent, alph


0.55


1.55


423738 


AB002134 


H (2195 


airway trypsin-like protease 


10.14 


51.00 




M18667 




progastricsin (pepsinogen C) 


3.36


1.52 






Hs.270840 




0.25 


9.45 


425828 


NM_000020


Hs.172670


activin A receptor type II-like 1


0.03 






AA001732 


Hs '7, 233 


hypothetical protein FLJ10970


0.01


1.49 








uteroglobin 


0.42 


1.26 


430280 


AA361258 


Hs.237868 


interleukin 7 receptor 


0.46


2.43 




X65018 


Hs.253495 


surfactant pulmonary-associated protein 


0.57






AW058350 




Homo sapiens mRNA; cDNA DKFZp564B2062 (f


0.29 


1.80 




T92363 


Hs.178703 


ESTs 




2.27 




AB036432 




advanced glycosylation end product-speci 


o!ai 


1.51 




AW449467 






0.55 


1.78 




AI082692 


Hs.134662




0.00


3.02 


444325 


AW152618 


Hs.16757 


ESTs 


0.32 


2.49 




AI904740 




receptor (calcitonin) activity modifying 


0.46 






NM_001089


Hs.26630


ATP-binding cassette, sub-family A (ABC1


0.52 










solute carrier family 6 (neurotransmitte


0.00


3.30




AF035528 


Hs.153863


MAD (mothers against decapentaplegic, Dr 


0.01 




444342 


NM_014398 


Hs.10887


similar to lysosome-associated membrane


0.66 


2.20 








Target Exon 


1.00 


297.00 








C11001883*:gi|6753278|ref|NP_033938.1| c


1.00 


109.00 








NM_016582*:Homo sapiens peptide transpor 


0.89 




402474 






NM_004079:Homo sapiens cathepsin S (CTSS


1.45


4.47








ENSP00000235229:SEM8. 


1.00 


1.87 


403021 






C21000030:gi|9955960|ref|NP_063957.1| AT


1.00 


149.00 








NM_031419*:Homo sapiens molecule possess


1.06 


2.95 








NM_007037*:Homo sapiens a disintegrin-li


0.04 


4.89 


403764 






NM_005463:Homo sapiens heterogeneous nuc


1.00 


225.00 








NM_019111*:Homo sapiens major histocompa


0.97 


1.93 








NM_002944*:Homo sapiens v-ros avian UR2 


1.00 


68.00 
1.83 




AI815601 




CD83 antigen (activated B lymphocytes, i 


0.02 








1100 1 01,503 1|re it 


1.00


235.00 


405381 






Target Exon 


1.00 


93.00 








Target Exon 


1.37


6.02 




M33600 




major histocompatibility complex, class 


0.86 


2.45 




AI219304 


Hs.266959 


hemoglobin, gamma G 


0.01 


3.19 




AA505665 


Hs.217493 


annexin A2 


1.00 


147.00 




M34996 


Hs.198253


major histocompatibility complex, class 


1.03 


2.04 




U82275 


Hs.94498 




1.00 










gb:Human trophoblast hypoxia-regulated f 


1.00 


90.00 




NM_000066




complement component 8, beta polypeptide 


1.00 


67.00 




NM_001086




arylacetamide deacetylase (esterase)


1.00 


102.00 


408045 


AW138959


Hs.245123 


ESTs 


1.00 


70.00 








ESTs 


1.00 


112.00 


408374 


AW025430 


Hs.155591 


forkhead box F1 


0.07 


10.17 






Hs.141883 


ESTs 


0.39 


2.31 






Hs.673 


interleukin 12A (natural killer cell sti


1.00 




409153 


W03754 


Hs.50813 


hypothetical protein FLJ20022 


0.01 


4.55 






Hs.687 


cytochrome P450, subfamily IVB, polypept 


0.01 


3.72 




AL049990 


Hs.51515 


Homo sapiens mRNA; cDNA DKFZp554G112 (fr


1.00 


79.00 


409389 


AD007979 


Hs.301281 


Homo sapiens mRNA, chromosome 1 specific 


0.14 


27.35 




D86640 


Hs.56045 


src homology three (SH3) and cysteine ri


1.00 


113.00 




BE178622 


Hs.16291 


gb:PM3-HT0605-270200-001-a02 HT0605 Homo 


0.64 


2.47 




NM_006770


Hs.67726 


macrophage receptor with collagenous str 


0.55 


2.40 




BE160198 




gb:QV1-HT0413-010200-059-h03 HT0413 Homo


1.00 


111.00 


412000 


AW576555 


Hs.15780 


ATP-binding cassette, sub-family A (ABC1 


1.00 


95.00 




BE047490 


Hs.24172 




1.00 


87.00 




AL035668 


Hs.73853 


bone morphogenetic protein 2 


1.43 


8.07 




X83703 


Hs.31432 


cardiac ankyrin repeat protein 


0.02 


3.07 


412859


AA290712 


Hs.82407 


CXC chemokine ligand 16


0.93 


1.72 






Hs.82407 


CXC chemokine ligand 16


0.97


1.51




U11874 


Hs.846 


interleukin 8 receptor, beta 


0.02 


2.42 




BE146973 




gb:QV4-HT0222-011199-019-e05 HT0222 Homo


0.65 


1.50 




BE157286 


Hs.20631 


zii finge p I ir u 1 > m y 1/ E Pe 


20.87 


232.00 




AA131466 


Hs.23767 


hypothetical protein FLJ12666 


1.00 


80.00 




AH 29238 






1.00 


85.00 


413802 


AW964490 




ESTs, Weakly similar to S65657 alpha-1C- 


1.00 


213.00 




NM_001872 


Hs.75572 


carboxypeptidase B2 (plasma) 


0.02 


3.93




BE393856 


Hs.66915 


ESTs, Weakly similar to 16.7Kd protein [ 


1.00 


115.00 




AI056548 


Hs.72116 


hypothetical protein FLJ20992 similar to 


0.49 


1.94 


414700 


H63202 


Hs.38163 




0.03 


3.75 




AA311223 


Hs.283091 




0.86 


1.95 




N64464 


Hs.34950 




1.00 


120.00 




BE269352 




neutrophil cytosolic factor 2 (65kD, chr 
ESTs 


0.60 


2.48 




AA847758 


Hs.111030


1.00


95.00




W92445 


Hs.165195


Homo sapiens cDNA FLJ14237 fis, clone NT


1.00 


136.00 


416030 


H15261 


Hs.21948 


ESTs 


0.02 


8.07 


416427 


BE244050 


Hs.79307 


Rac/Cdc42 guanine exchange factor (GEF) 


1.00 


73.00 


416464 


NM_000132


Hs.79345


coagulation factor VIII, procoagulant co


0.70 


3.36 


416585


X54162 


Hs.79386 


leiomodin 1 (smooth muscle)


0.06 


6.56 


416847 


L43821 


Hs.80261 


enhancer of filamentation 1 (cas-like do


0.70 


3.63 




AA359896 


Hs.293885 


hypothetical protein FLJ14902


1.00 


114.00 


417370 


T28651 


Hs.82030 




0.85 


1.30 


417673 


T87281 


Hs.16355 


ESTs 


0.15 


15.54 



418067 


AI127958 


Hs.83393 


cystatin BM 


0.81 


1.74 




C01566 


Hs.86671 


ESTs 


1.00 


99.00 


418643 


J03798 


Hs.86948 


small nuclear ribonucleoprotein D1 polyp 


1.00 


60.00 


418832 


X04011 


Hs.88974 


cytochrome b-245, beta polypeptide (chro


2.40 


14.74 


418945 


BE246762 


Hs.89499 


arachidonate 5-lipoxygenase


0.67


3.16


419261 


X07876 


Hs.89791 


I s-tyf h[,'T/ e j r 


1.00 


73.00 




U08989 




solute carrier family 1 (neuronal/epithe


1.00 


192.00 


419968 


AK001989 
X04430 


Hs.93913


hypothetical protein

interleukin 6 (interferon, beta 2)


1.00 
61.16 


94.00 
500.00 






Hs.76206 


cadherin 5, type 2, VE-cadherin (vascula 


0.52




420235 


AA258124 


Hs.293878 


ESTs, Moderately similar to ZN91_HUMAN Z 


1.00 


172.00 




AA278436 


Hs.186649 




1.00 


97.00 




AA286746 


Hs.9343 


Homo sapiens cDNA FLJ14265 fis, clone PL


1.00


64.00


421445 


AA913059 


Hs.104433


Homo sapiens, clone IMAGE:4054868, mRNA


0.88 


1.51 








annexin A3


0.05


11.25 




AI683243


Hs.97258 


ESTs, Moderately similar to S29539 ribos 


1.00


73.00 


421563 


NM_006433 


Hs.105806 


granulysin 


0.82


2.42 




NM_000399 


Hs.1395


early growth response 2 (Krox-20 (Drosop


5.50


31.57 




FD6504 




ESTs, Moderately similar to ALU4_HUMAN A


1.00


129.00 


421913 


AI934365


Hs.109439


osteoglycin (osteoinductive factor, mimi


1.00


101.00 




AA300900 


Hs.98849 


ESTs, Moderately similar to AF161511 1 H


0,60 


63.60 


422232 


D43945 


Hs.113274


transcription factor EC 


1,00 


148.00 






Hs. 115830 


heparan sulfate (glucosamine) 3-O-sulfot 


1,40 


3.98 






Hs. 124940 


GTP-binding protein 


0,34 


3.59 




AK001866 


Hs. 1251 39 


hypothetical protein FLJ11004


0,55 


2.00 


423387 


AJ012074 




vasoactive intestinal peptide receptor


0,09 


2.13 


423424 


AF150241 
AL110151 


Hs. 128433 


prostaglandin D2 synthase, hematopoietic
DKFZP686D0824 protein 


1.00
1.00 


141.00 
66.00 








Sushi domain (SCR repeat) containing 


0,73 


1.27 




AW337575 




ESTs 


0.54


2.58




NM_005814 


Hs.143131 


glycoprotein A33 (transmembrane) 


0.77 


2.47 






Hs.1 26059 


ESTs 


1,00 


74.00 




AF020202 


Hs.155001 


UNC13(C.elegans)-like 




1.95 


426486 


BE178285 


Hs.159494 
Hs.170056 


Homo sapiens mRNA; cDNA DKFZp586B0220 (f


lll8 
1.00 


76.00 




AF240467 




toll-like receptor 7 


1.00 


63.00 




NM_000760 


Hs.2175 


colony stimulating factor 3 receptor (gr 


0.60 


2.19 




NM_002980




secretin receptor 


0.97 


1.42 




AA765368 


Hs.293941


ESTs, Moderately similar to A53959 throm 


1.00 


105.00 




BE268717 


Hs.104916 


hypothetical protein FLJ21940


1.00 


80.00 




AW207175 


Hs. 106771 


ESTs 


o.os 


2.55 


428780 


AI478578 
AI928355 


Hs. 50636 


ESTs 


1.00 
1.00 


98.00 
113.00 




D13626 




KIAA0001 gene product; putative G-protei 


1.00 


52.00 




AA469153 




gb:nc67f04.s1 NCI_CGAP_Pr1 Homo sapiens 


1.00 


132.00 




BE245562 




adrenergic, beta-2-, receptor, surface 


0.11 


15.60 




AW292053 




chromosome 1 open reading frame 21 
ESTs 


1.00 


103.00 


430414 


AW365665 


Hs.120388


0.50 


6.96 




AA482900 


Hs.162080 


ESTs 


1.00 


70.00 




AI734149 


Hs.1 19514 




1.00 


90.00 


430998 


AF128847 


Hs.204038 


indolethylamine N-methyltransferase 




1.84 




NM_013427


Hs.250830 


Rho GTPase activating protein 6 


1,00 


79.00 




N46466 


Hs.58879 


ESTs 


0.91 


1,67 


432176 


AW090386 


Hs.1 12278 


arrestin, beta 1


0.66 


2.63 




AA305746 




macrophage scavenger receptor 1 


1,00 


76.00 




AA339977 


Hs.274127 


CLST 11240 protein 


0.46 


1.46 


432485 


N90866 


Hs.276770 


CDW52 antigen (CAMPATH-1 antigen) 


0.79 


2.25 


432522 


D11466 


Hs.51 


phosphatidylinositol glycan, class A (pa 


1.93 


4.83 




AJ224741 


Hs.278461 




0.04 


5.79 










1.00 


167.00 




AB029496 


Hs.59729 


semaphorin sem2 


0.04 


9.15 




AI732637 


Hs.277901 




1.00 


91.00 


433588 


AI056872 


Hs.133386 




120.16 


315.00 


434445 


AI349306 
AW840171 


Hs.265398 


ESTs, Weakly similar to transformation-r 
Homo sapiens beta-1 adrenergic receptor 


0.60 
1.00 
1.00 


1.84 
128.00 
108.00 




AI248584 




Homo sapiens cDNA: FLJ21326 fis, clone C


1.00 


91.00 




BE048860 






1.00 


87.00 








hypothetical protein FLJ12910


1.00 


105.00 




AA370041 


HSJ456 


SWI/SNF related, matrix associated, acti 


1.00 


71.00 




H29796 


Hs.269622 


ESTs 


1.00 


115,00 


438199 


AW016531 


Hs.i/2147 


ESTs 


1.00 


80.00 


439551 


W72062 


Hs.11 112 




0.30 


3.10 






Hs.7239 


SEC24 (S. cerevisiae) related gene famil 
ESTs 


1.00 


77.00 




AI799488 


Hs.1 35905 


1.00 


85.00 




AA913880 


Hs.1 76379 


ESTs 


1.00 


82.00 


441384 


AA447849 


Hs.288660 


Homo sapiens cDNA: FLJ22182 fis, clone H
ESTs 


0.79 


1.89 




AI738675 


Hs.1 27346 


1.00 


75.00 


442200 
442832 
442957 


AW590572 
AW206560 
AI949952 


Hs.235768 
Hs.253569 


ESTs 
ESTs 
ESTs 


0.78 
0.03 
1.00 


5.83 
10.88 
70,00 


443282 


T47764 


Hs.1 32917 


ESTs 


1.00 


197.00 


443547 


AW271273 


Hs.23767 


hypothetical protein FLJ12666


1.00 


253.00 


443951 
444330 


F13272 
AI597655 


Hs.1 11334 
Hs.49265 


ferritin, light polypeptide 
ESTs 


0.55 
1.00 


2.09 
90.00 
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AW204933 














Hs.23666 














Homo sapiens clone 24425 mRNA sequence 








BE397753 


Hs.14623 


interferon, gamma-inducible protein 30 








AI347863 












NM_006691 




extracellular link domain-containing 1 








AW958473 


























































450575 


NM_005859


Hs.29117


purine-rich element binding protein A 


0.17 


11.33 




AA040403 












AW450461 














Hs.31 570 


ESTs, Weakly similar to KIAA1324 protein








R52804 




























cartilage acidic protein 1 






452197 


AW023595 


HS732048 


ESTs 

purine-rich element binding protein A 


100 


67.00 






Hs.29191 


epithelial membrane protein 2 




















NM_016113 
AW295374 


Hs.279746 


vanilloid receptor-like protein 1 






453355 


Hs.31 412 


Homo sapiens cDNA FLJ11422 fis, clone HE


1.00 


132.00 


453390 


AA862496 


Hs.28482 


ESTs 




72.00 


453531 


AA417940 
BE154396




ESTs, Weakly similar to JC5795 CDEP prot 
gb:CM2-HT0342-091299-050-b05 HT0342 Homo


1.00 
0.57 


68.00
2.85


456579 


AA287827 


Hs.284205 


up-regulated by BCG-CWS 


1.00 


82.00 


456672


AK002016 


Hs.1 14727 


Homo sapiens, clone MGC: 16327, mRNA, com 


0.79 




457400 


AF032906 


Hs.252549 


cathepsin Z


1.03 


3^25 


457718 
459696 


F18572 
F03027 


Hs.22978 


ESTs, Weakly similar to ALU4_HUMAN ALU S 
gb:HSC1 KA072 normalized infant brain cDN 


i"oo 


544^00 



430212 314437.1 

436532 421802J 

453531 97026J 

454741 1232559.1 



R20723 AA263003 M333976 AA33472S AA3341 51 AW965490 AA31 051 3 AI81 0530 D31302 AW1 34897 AA830127 AA046953 AI668930 
C06094AW1 04534 

BE1 60198 AW935898 T1 1 520 AW935930 AW856073 AW861 034 

BE146973 BE1 46972 BE147042 BE147018 BE145783 BE147020 BE146781 BE1 47019 BE146766 BE147021 BE145952BE146767BE147044 
BE146797 BE146776 BE146985 BE146793 BE145768 BE146771 BE146954 BE146760 BE147048 BE147025 3E147030 
AJ012074 U11087 L13288 X75299 L20295 AW6307B0 H14880T28037 AI872991 R72136 AW449839T81622 T79697 T29519 R94105T83923 
R73300 AI797007 R73390 AA961 01 0 H741 68 A1689932 BE045543 A180841 8 A160891 2 AI806573 AW884084 AW872978 AW872985 AA565655 
AI022915 R50647 R73210 H45098 R46451 AW166269 T7 1 132 A1264547 R52146 AI304920 R73391 AW864059 AW884085 H73241 T60038 
T79612 R73145 R50E49 AI094557 A1668793 R72302 A1564366 W01956 AA418962 W32571 R72840 H45409 R72085 R46356 R46758 
AA508805 AA418798 T83751 R94072 T1 61 82 AA9287B5 AA903896 

292546 M330586 A1570568 A W341487 A1827050 AW298668 AI792189 A1015S93 AI733599 AI572251 A! 672488 AW1 93262 AI244716 

A1864375 AI2061 00 AA912444 A1269365 Al 640254 AW772466 A1857336 AA627604 H 1 6914 AA358477 AA338009 

AA469153AI718503AA469225 

M721522AW975443 T93070 

M417940AA036735T07025 

BE154396 AW817959 BE154393 



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22" Dunham I. et al. Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
NLposition: Indicates nucleotide positions of predicted exons.



402808 
403021 
403421 
403438 
403687 
403764 
404277 
404288 
404394 
404518 
404916 
405106 



Ref 

7331445 
8117619 
3242744 
7547175 
6456148 
7547270 



7341826 
8079395 
7329310 



NLposition 



90044-90184,91111-91345 
33192-33360 

53526-53628,55755-55920,57530-57757 

114964-115136,115461-115585,115931-116047,117665-117771,118004-118102 
120799-120966 

126609-126773,139986-140205 

90792-90938 

9009-9534 

118692-118853 

91665-91946 

3512-3591 

37121-37205,37491-37762,41053-41140,41322-41593,41773-41919 

84494-84603 

91057-91188 

80877-81418 

73121-73273 

7636-8054 
Table 11B shows the accession numbers for those Pkey's lacking UnigeneID's for table 11A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

\. For each predicted exon, we have listed the genomic 
Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, [...] average of normal lung samples
R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, [...]








ExAccn 


UnigeneID


Unigene Title 










Target Exon 












NM_003122*:Homo sapiens serine protease








M29540 


Hs.220529 


carcinoembryonic antigen-related cell ad 








AI827976


Hs.24391 


hypothetical protein FLJ13612








AW072003 


Hs.40968 


heparan sulfate (glucosamine) 3-O-sulfot 








BE296227 


Hs.250822 


serine/threonine kinase 15 








AF251237 


Hs.1 12208 


XAGE-1 protein 








AF154830 


Hs.50966 


carbamoyl-phosphate synthetase 1, mitoch








AA5769E3 


Hs.22972 


hypothetical protein FLJ13352








T05387 


Hs.7991 


ESTs 








AW248508 


Hs.279727 


Homo sapiens cDNA FLJ14035 fis, clone HE








BE068889 




synuclein, gamma (breast cancer-specific 








L27943 


Hs.72924 


cytidine deaminase 








NM_000047 


Hs.74131 


arylsulfatase E (chondrodysplasia puncta








U11862 


Hs.75741 


amiloride binding protein 1 (amine oxida 








AW291168 


Hs.41295 


EST akly imi :MU H MAN MUCIN 








J04129 


Hs.82269 


progestagen-associated endometrial prote








U60669 


Hs.89663 


cytochrome P450, subfamily XXIV (vitamin 








AU076704 




fibrinogen, A alpha polypeptide 








AW188117 


Hs.303154 


popeye protein 3 








AF044197 


Hs.1 00431 


small inducible cytokine B subfamily (Cy 








H87879 


Hs.102267 










U95031 


Hs.102482 


mucin 5, subtype B, tracheobronchial 








U76362 


Hs.104637 


solute carrier family 1 (glutamate trans








Y11339 


Hs.105352 


GalNAc alpha-2, 6-sialyltransferase 1, 1 








AI910275 




trefoil factor 1 (breast cancer, estroge 








U80736 


Hs.1 10826 


trinucleotide repeat containing 9 








AI868872 


Hs.282804 


hypothetical protein FLJ22704 








AF073515 


Hs.1 14948 


cytokine receptor-like factor 1 








L32137 


Hs.1584 


cartilage oligomeric matrix protein (pse 








AF041260 


Hs.1 29057 










M90516 


Hs.1674 


glutamine-fructose-6-phosphate transamin








AF242388 


Hs.149585 


lengsin 








M88700 


Hs.1 50403 


dopa decarboxylase (aromatic l-amino aci 








NM_002497


Hs.1 53704 


NIMA (never in mitosis gene a)-related k 








BE245380 


Hs.1 53952 


5' nucleotidase (CD73) 








AB007948 


Hs.158244 


KIAA0479 protein 








AA367019 


Hs.241395 


protease, serine, 1 (trypsin 1) 


1.00 


83.00 




AA411101 


Hs.243886 


nuclear autoantigenic sperm protein (his 


7.41 


34.00 


428585 


AB007863 


Hs.185140 


KIAA0403 protein 


1.00 


6.00 


428758 


AA433988 


Hs.98502 


hypothetical protein FLJ14303


1.06 




429170 


NM_001394


Hs.2359 


dual specificity phosphatase 4 


16.18 


105.00 


429263 


AA019004 


Hs.1 98396 


ATP-binding cassette, sub-family A (ABC1


1.07 


1.00 


429610 


AB024937 


Hs.211092 


LUNX protein; PLUNC (palate lung and nas 


1.59 


1.59 


430508 


AI015435 


Hs.104637 


ESTs 


4.75 


7.27 


43C985 


AA490232 


Hs.27323 


ESTs, Weakly similar to I78885 serine/th


0.94 


1.28 


431548 


AI834273 


Hs.9711 


novel protein 


5.66 


15.00 


431566 


AF176012 


Hs.260720 


J domain containing protein 1 


49.76 


37,00 


431986 


AA536130 


Hs.149018 


Novel human gene mapping to chomosome 20 


1.19 


1.47 


432375 


BE536069 


Hs.2962 


S100 calcium-binding protein P


1.65 


1.05 


432677 


NM_004482


Hs.278611 


UDP-N-acetyl-alpha-D-galactosamine:polyp 


1.00 


46,00 


433556 


W56321 


Hs.1 11460 


calcium/calmodulin-dependent protein kin 


1.00 


1S.00 


433819 


AW511097 


Hs.1 12765 


ESTs 


3.71 


8.00 


434001 


AW950905 


Hs.3697 


serine (or cysteine) proteinase inhibito 


29,31 


72.00 


434424 


AI811202


Hs.325335 


Homo sapiens cDNA: FLJ23523 fis, clone L
ESTs 


1.00 


64.00 


434792 


M649253 


Hs.132458 


8.52 


44,00 


436217 


T53925 


Hs.107 


fibrinogen-like 1 


57.97 


31.00 


436749 


AA584890 


Hs.5302 


lectin, galactoside-binding, soluble, 4 


1.10 




436972 


AA284679 


Hs.25640 




1,59 


1.46 


437866 


AA156781 




metallothionein 1E (functional)


3.62 


101.00 


437935 


AW939591 


Hs.5940 


mucin 13, epithelial transmembrane 




1,39 


438915 


AA280174 


Hs.285681 


Williams-Beuren syndrome chromosome regi 


1X» 


1.00 


439451 


AF086270 


Hs.278554 


heterochromatin-like protein 1 


23.28 


52.00 



ge of normal lung samples 
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AL359055 






















441377 


BE218239 


Hs202656 


ESTs 


22.03 


1.00 














443813 


AA876372 


Hs!93961 


Homo sapiens mRNA; cDNA DKFZp667D095 (fr 


UO 


1.99 




NM_002250 




rji' 1 jrn intcn 1 t mal i i<i i 






444670 


H58373 




hypothetical protein MGC5370 








AV652066 












AW1 68067 










446163 
446469 


AA026880 
BE094848 


Hs'.25252 


Homo sapiens cDNA FLJ13603 fis, clone PL 


1*00 


36,00 




AWB30534 




hypothetical protein FLJ20607 






448243 


AW369771 


Hs.52620 


integrin, beta 8


15.84 


1.00


448844 


AI581519 


Hs.177164 


ESTs 


1.00 


31.00 


449444 


AW818436 


Hs.23590


solute carrier family 16 (monocarboxylic


1.00 


63.00 


451807 
452689 


W52854 
F33868 


Hs.284176 


hypothetical protein FLJ23293 similar to 


1.55 
1,54 


35,00 
1.44 


453392 


U23762 


Hs.32964 


SRY (sex determining region Y)-box 11 


1,00 


16,00 


453464 


AI884911


Hs.32989 


receptor (calcitonin) activity modifying 
ESTs 


1.55 


2.45 


453735 


AI066629 


Hs.125073 


1.01 


1.30 



TABLE 11B 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



410399 11995 1 BE068889 BE068882 AF04431 1 AF017256 NM_003087 AF037207 AF010126 AA633976 AA872836 BE298825 BE299889 AI01 S464 AI684600 

AI936527 AA804675 M394097 AI1 39933 AA946606 BE171313 AA722407 AA293803 AI468480 AA056035 AA055968 AW796957 AI537713 
AA410737 H49348 AA486472 AA411094 AA235594 AA402624 AA443638 AW452137 AA421708 AW26521 1 AI493266 M365132 AW966044 

419502 18535 1 AU076704T74854 T74860 T72098 T73265 T73873 T69180 T74656 T58786 T60385 T73410 T68781 T67845 T67593 T73952 T67864 T60530 

T68367 T68401 T53959 T72360 T72099 T60377 T58961 T71712 T72821 T64738 T74645 T72037 T68688 T72063 T73258 T72826 T64242 
T68220 T74673 T71 800 T68355 T61 227 T62738 T6931 7 T53850 T64692 T73768 T73962 T73382 T68914 T70975 T73400 T60631 T73277 
T73203 T70498 T6 1 409 T58925 NMJC050B M54982 T5B301 T73729T69445T60424T67922T67736 T6871 6 T67755 T74765 T73819 T58719 
T74756 T60477 T74863 T61 109 T68329 T58850 T71857 T73425 T53736 T58607 T58898 T54309 T72031 T72079 T64305 T71908 T681 07 
T71916 T73787 T56035 T64425 T71870 T60476 T61376 T67820 T71895 T41006 T69441 T68170 T74617 T71958 T69440 T61875 R06796 
H48353 T719 14 T53939 T641 21 AA693995 T72525 T67779 T68078 AA01 1 465 AA345378 AV654847 AV654272 AV656001 AI064740 T82897 
N33594 AA344542 AW805054 A1207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092 T39959 AI017721 AA312395 
AA312919 T40156 H66239 AV652989 H38726 R98521 AV655200 R957 9 0 W03250 W00913 AA344136 AV660126 R97923 M343596 
AW470774 AV651256 N54417 AA812862 AW182929 AI111192 H61463 H72060 AA344503 H38639 AI277511 AV661108 AI207625 T47810 
AA235252 T27853 T47778 R95745 H70620 AA701463 AW827166 R98475 C20925 AV657287 T71959T71313T73920 T73333T61618T69293 
T69283 T73931 T72176 T72456 AV645539 AV65347B T72957 T723C0 T58906 T71457 T70494 T72956 T70495 T68267 T74407 T85778 
AA344726 T27854 T74485 T741 01 T73868 T71 51 8 T72304 M343853 T73909 T88070 T720B5 H72149 T73493 T73495 AV645993 R02293 
T70475 T64751 M344441 AA343657 AA345732 AA344328 Al 1 10639 AA344603 AF06351 3 T54696 T6851 5 T72223 T60507 T67633 R29500 
T72517 R02292 T60599T69206 T70452T74S77 R29366 T61277 T74914 TS0352 RZS675 T74843 AV645792 AA344408 T69197 T72057 
T69368T69358T68258AV650429 T73341 T61702T74598T40095 K02J72T40106AA343O45AA341908 AA341907 AA342B07AA341964 
T53747 T72042 T62764 AI064899 AA343060 T57832 T72440 T71770 T58091 T69108 T72449 T69167 T71289 T58251 AV654844 T64375 
AA345234 T67598 M01 1 41 4 T58036 H4B262 AI207557 T6821 9 W86031 T59081 T64232 R93196 T62136 AV650539 H67459 T72978 
AA344583T60362 H58121 T95711 T728D3 T68055T71715 R29036 T72793 T69122 T64595 T62B88 T691 39 T68291 T64652T67971 T46862 
AA693592 AI248502 R29454 T64764 T57001 T73052 T71429 T51176 T58866 AV655414 H90426 AA342489 T73666T67848 T72512 T53835 
T67837 T7331 7 T74273 T69420 T68245 T74380 T67B52 T74474 T56068 

421582 2041 1 AI910275 X00474 X52003 X05030 NMJJ03225 AA314326 AA308400 AA506787 AA314825 AI571948 AA507595 AA614579 AA587613 R83818 

AA56831 2 M614409 AA307578 AI925552 AW9501 55 A!91 3383 M1 2075 BE074052 AW004668 AA578674 AA582084 BE074053 BE0741 26 
BE074140 AA514776 AA588034 BE074051 BE074068 AW009769 AW050690 AA658276 R55389 AI001051 AW050700 AW750216 AA614539 
BE074045 AI307407 AW502303 BE073575 AI202532 AA524242 AI970639 AI909751 BE076078 AI909749 R55292 

437866 44433J AA156781 AW293839 U52054 M024963 AA778446 BE073977 AW444904 AW502574 BE164040 BE164012 BE163972 BE163974 BE163992 

AA837481 AW468444 BE185091 AW468002 AA687333AA81 1830 AA5B1806 AI866686 AI572124 AA043777 AA040926 D20160 AI53S733 
AA812489 AW874142AI471883 W84421 AA156850 

451807 8865 1 W52854 AL1 1 7600 BE208115 BE208432 BE206239 BE082291 AW953423 AA351B19 BE1 80648 BE140560 W60080 AA865478 N90291 

AW450652AW449519AA993634AI806539M351618AW449522AI827625AA9047eeAA3B0381 AA686045M774409 BE003229 Z41756 



TABLE 11C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

NLposition: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand NLposition

403329 8516120 Plus 95450-96598 

406399 9256288 Minus 63448-63554 
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i-malignant lung disease, and normal lung. These genes 



Table 12B shows the accession numbers for those Pkey's lacking UnigeneID's for table 12A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.



m have listed the genomic 



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell ca[...] average of normal lung
R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, a[...]
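The R1 and R2 columns defined in the legend are ratios of group averages. A minimal sketch of that arithmetic follows. The function name, argument names, and sample intensities are hypothetical, and because the legend text for R2 is truncated, the use of the normal-lung average as the R2 denominator is an assumption rather than something the legend states.

```python
from statistics import mean


def expression_ratios(tumor, disease, normal):
    """Return (R1, R2) for one probeset.

    R1 = mean tumor signal / mean normal-lung signal
    R2 = mean non-malignant-disease signal / mean normal-lung signal
    The shared normal-lung denominator for R2 is an assumption.
    """
    normal_avg = mean(normal)
    if normal_avg == 0:
        raise ValueError("normal-lung average is zero; ratio undefined")
    return mean(tumor) / normal_avg, mean(disease) / normal_avg


# Hypothetical intensities for illustration only; not values from the tables.
r1, r2 = expression_ratios(
    tumor=[120.0, 80.0, 100.0],
    disease=[30.0, 50.0],
    normal=[10.0, 30.0],
)
print(r1, r2)
```

Under this reading, a value of 1.00 in either column would correspond to a group average equal to the normal-lung average.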








Unigene Title 






400289 






matrix metalloproteinase 10 (stromelysin






400566 






N" <J) r '< i- Hon ' apiens matrix metal pro 






401780 






NM_005557*:Homo sapiens keratin 16 (foca






401781 






Target Exon 






401785 






NM_002275*:Homo sapiens keratin 15 (KRT1






401994 






Target Exon 






402075 






ENSP00000251056*:Plasma membrane calcium






404996 






Target Exon 






407839 




Hs.161566 








408000 






bullous pemphigoid antigen 1 (230/240kD) 






408522 


AI541214 




Small proline-rich protein SPRK (human, 






410561 


BE540255 




Homo sapiens cDNA: FLJ22044 fis, clone H






415091 


AL044872 




3-hydroxy-3-methylglutaryl-Coenzyme A sy 








U88967 


Hs.78867 


protein tyrosine phosphatase, receptor-t 






416658 




Hs.79432 


fibrillin 2 (congenital contractural ara 






417034 


NM_006183 


Hs.80962 


neurotensin 








BE185289




small proline-rich protein 1B (cornifin)






418663 




Hs.41690 


desmocollin 3 






418678 


NM_001327 




cancer/testis antigen 






419121 


AA374372 




parathyroid hormone-like hormone 






420783 


AI659838 




lectin, galactoside-binding, soluble, 7 






421773 


W69233 








421948 












421978 


AJ243662 




NICE-1 protein 






422158 






protease inhibitor 3, skin-derived (SKAL 






422440 


NM_004812 


Hs.1 16724 


aldo-keto reductase family 1, member B10 






423634 


AW959908 




heparin-binding growth factor binding pr 






423725 


AJ403108




hypothetical protein LOC57822 






423738 


AB002134 




airway trypsin-like protease 






424012 


AW368377 




tumor protein 63 kDa with strong homolog






424046 


AF027866 


Hs! 138202 


serine (or cysteine) proteinase inhibito


1.00


1.00


424098 


AF077374 


Hs'.l 39322 


small proline-rich protein 3 


137.82 


54.00 


424834 


AK001432 


Hs.153408 


H i enscDNAFLJ105 0 fis, clone NT 


56.19 


12.00 


425650 


NM_001944


Hs.1925 


desmoglein 3 (pemphigus vulgaris antigen


33.45 


1.00 


427099 


AB032953 


Hs.1 73560 


odd Oz/ten-m homolog 2 (Drosophila, mous 


4.24 


17.0C 


427335 


AA448542 


Hs.251677 


G antigen 7B 


51.83 


4.00 


428182 


BE386042 


Hs.293317 


ESTs, Weakly similar to GGC1_HUMAN G ANT


1.00 


1.C0 


428645 


AA431400 


Hs.98729 


ESTs.Wea I) milar to 201720 Adihydro 


1.00 


15.0C 


428748 


AW593206 


Hs.98785 


Ksp37 protein 


1.00 


87.00 




AA420450 


Hs.292911 


ESTs, Highly similar to S60712 band-5-pr 


2.01 


1.18 


429538 


BE182592 


Hs.1 1261 


small proline-rich protein 2A 


4.43 


Z90 


429903 


AL134197 


Hs.93597 


cyclin-dependent kinase 5, regulatory su 


11.80 


1.0C 


430486 


BE062109 


Hs.241551 


chloride channel, calcium activated, fam 


12.28 


41.00 


430890 


X54232 


Hs.2699 


glypican 1 


1.58 


1.40 


431009 


BE1 49762 


Hs.48956 


gap junction protein, beta 6 (connexin 3 


60.25 


28.CC 


431846 


BE019924 


Hs.271580 


uroplakin 1B 






433091 


Y12642 


Hs.3185 


lymphocyte antigen 6 complex, locus D 


120 


1.08 


434360 


AW015415 


Hs.127780 


ESTs 


40.98 


27.0C 


434880 


U02388 


Hs.101 


cytochrome P450, subfamily IVF, polypept 


1.00 


1.00 


435505 


AF200492 


Hs.21 1238 


interleukin-1 homolog 1


1.00 


33.00 


435793 


AB037734 


Hs.4993 


KIAA1313 protein 


23.66 


42.00 


436511 


AA721252 


Hs.291502 


ESTs 


16.76 


14.00 


438403 


AA806607 


Hs.292206 


ESTs 


1,00 


1.00 


439285 


AL133916 




hypothetical protein FLJ20093 


4S.23 


139.00 


439605 


W79123 


Hs.58561 


G protein-coupled receptor 87 


33.61 


1.00 


439670 


AF088076 


Hs.59507 


ESTs, Weakly similar to AC004658 3 U1sm 


1.00 


1.00 


439705 


AW872527 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH


86.55 


11. OC 


440325 


NM_003812


Hs.7164 




62.88 


147.C0 


441525 


AW241867 


Hs.127728 


ESTs 


1,53 


1,42 


443162 


T49951 


Hs.9029 


DKFZP434G032 protein 


31.11 


38.00 


444378 


R41339 


Hs.1 2569 


ESTs 


1.00 


1.00 



nd carcinoid tumors) divided by the 
446292 AF081497 Hs.279682 Rh type C glycoprotein 1.55 1.26

447078 AW885727 Hs.9914 ESTs 47.24 24.00 

447342 AI199268 Hs.19322 Homo sapiens, Similar to RIKEN cDNA 2010 28.63 1.00 

449003 X76342 Hs.389 alcohol dehydrogenase 7 (class IV), mu o 1.00 1.00

449101 AA205847 Hs.23016 

450832 AW970602 Hs.105421 ESTs 

452240 AI591147 Hs.61232 ESTs 

453317 NM_002277 Hs.41696 keratin, hair, acidic, 1

453830 AA534296 Hs.20953 ESTs 

454098 W27953 Hs.292911 ESTs, Highly similar to S60712 band-6-pr 

455601 AI368680 Hs.816 SRY (sex determining region Y)-box 2 

TABLE 12B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted. 
NLposition: Indicates nucleotide positions of predicted exons. 

Pkey Ref Strand NLposition 

400666 8118496 Plus 17982-18115,20297-20456 

401780 7249190 Minus 28397-28617,28920-29045,29135-29296,29411-29567,29705-29737,30224-30573 

401781 7249190 Minus 83215-33435,83531 636- 10-8 2 " 4393 84955-85037,86290-86814 

401785 7249190 Minus 165776-165996,166189-166314,165408-166569,167112-167268,167387-167469,168634-168942 

401994 4153858 Minus 42904-43124,43211-43336,44607-44763,45199-45281,48337-46732 

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076 

404996 6007890 Plus 37999-38145,38652-38998,39727-39872,40557-40674,42351-42450 
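Each NLposition entry in the table above is a comma-separated list of start-end pairs, one pair per predicted exon. The following sketch shows one way such a row could be read programmatically. The function names are hypothetical, and treating the coordinates as inclusive when computing lengths is an assumption; the legend does not say whether the end position is included.

```python
def parse_nl_position(text):
    """Parse an NLposition string such as '17982-18115,20297-20456'
    into a list of (start, end) integer pairs, one per predicted exon."""
    exons = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons):
    # Assumes inclusive coordinates, so each exon spans end - start + 1 bases.
    return sum(end - start + 1 for start, end in exons)


# First data row of the table above: Pkey, Ref (GI number), Strand, NLposition.
row = "400666 8118496 Plus 17982-18115,20297-20456"
pkey, ref, strand, positions = row.split()
exons = parse_nl_position(positions)
print(pkey, strand, exons, total_exon_length(exons))
```

Rows whose position field was damaged in extraction would raise a ValueError here, which is a simple way to flag them for manual checking.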
TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal Lung
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normal lung. These genes were selected from about 59680 probesets on 



Table 13B shows the accession numbers for those Pkey's lacking UnigeneID's for table 13A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

ss in table 13A. For each predicted exon, we have listed the genomic 



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, [...] average of normal lung samples
R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, [...]






ExAccn 


UnigeneID


Unigene Title 


R1 




408562 


AI436323 


Hs.31141 


Homo sapiens mRNA for KIAA1568 protein, 


1.00 




409031 


AA376836 


Hs.76728 


ESTs 




128.00 


412372 


R65998 


Hs.285243 


hypothetical protein FLJ22029




173.00 




U20350 


Hs.78913 


chemokine (C-X3-C) receptor 1 


1.00


145.00


417511 


AL049176 


Hs.82223 


chordin-like 


1.00 


179.00 


418819 


AA228776 


Hs.191721 


ESTs 






422060 


R20893 


Hs.325823 


ESTs, Moderately similar to ALU5_HUMAN A 








AA464840 


Hs.131987 


ESTs 


1.00 


167.00 


426753 


T89832 


Hs.170278 


ESTs 




141.00 


429496 


AA453800 


Hs.192793 


ESTs 


too 




430719 


AA488988 


Hs.293796 


ESTs 


1.00 




431089 


BE041395 




ESTs, Weakly similar to unknown protein 


23.32 


941 W 


431385 


BE178536 


Hs.11090 


membrane-spanning 4-domains, subfamily A 


1.00 


157.00 


431728 


NM_007351


Hs.268107 






157.00 


436532 


AA721522 




gb:nv54h12.r1 NCI_CGAP_Ew1 Homo sapiens






437960 


AI669586 


Hs.222194 


ESTs 




147.00


438202 


AW1 69287 


Hs.22588 


ESTs 


1.00 




441499 


AW298235 


Hs.101689 


ESTs 


1.00 


167.00 


444513 


AL120214 


Hs.7117 


glutamate receptor, ionotropic, AMPA 1 


1.00 


151.00 


448253 


H25899 


Hs.201591 


ESTs 


1.00 


141.00 




R67837 


Hs.169872 


ESTs 




116.00 


458332 


AI000341 


Hs.220491 


ESTs 


1.00 


192.00 


459587 


AA031956 




gb:zk15e04.s1 Soares_pregnant_uterus_NbH


1.00 


154.00 


TABLE 13B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession

431089 327825 BE041395 AA491826 AA621946 AA715980 AA556102

436532 421802, AA721522 AW975443 T93070







natous and carcinoid tumors) divided by the 
divided by the averagi 



TABLE 13C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

NLposition: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand NLposition

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
Table 14B shows the accession numbers for those Pkey's lacking UnigeneID's for table 14A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.



ve have listed the genomic 



ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
Pref. Utility: Preferred Utility
Pred. Loc: Predicted subcellular localization







UnigeneID


Unigene Title 


Pref Utility 


Pred. Loc 




Hs.2258 




mAb & diag & s.m. 


extracellular 




M242758 




LIV-1 protein, estrogen regulated 


mAb 


plasma membrane 








ENSP00000251056*:Plasma membrane calcium


mAb & diag


secreted 




AW1 90902 


Hs.40098 


cysteine knot superfamily 1 , BMP antagon 


diag 






Y00787 




interleukin 8


diag 


secreted 


408790 


AW580227 


Hs]47860 


neurotrophic tyrosine kinase, receptor, 


mAb & s.m. 


plasma membrane 


408908 


BE296227 


Hs.250822 


serine/threonine kinase 15








AB033025 


Hs.50081 


Hypothetical protein, XP_051860 (KIAA1 19 


CTL & diag 




409103 


AF251237 


Hs.1 12208 


XAGE-1 protein 


CTL 


nuclear 


409420 






laminin, gamma 2 (nicein (100kD), kalini 




secreted 


409632 


W74001 


Hs.55279 


serine (or cysteine) proteinase inhibito 


diag 






NM 001398 


Hs.123114 


cystatin SN 


diag 


extracellular 




AW247090 




minichromosome maintenance deficient (S. 


CTL 






AW103364 






diag 


extracellular 


410001 


AB041036 


Hs.57771


kallikrein 11


diag 


extracellular 








carbonic anhydrase IX 


mAb & s.m. 










transmembrane protease, serine 4 


mAb & diag & s.m. 






AA219691 




RAB6 interacting, kinesin-like (rabkines 








AW016610 


Hs.816 


ESTs 














diag 






AA926960 




CDC28 protein kinase 1 










Hs.295944 


tissue factor pathway inhibitor 2 








NM_005025




serine (or cysteine) proteinase inhibito 


mAb & diag & s.m. 










protein tyrosine phosphatase, receptor-t 


mAb & s.m. 


plasma membrane 








fibrillin 2 (congenital contractural ara 


diag 






NM_006183




neurotensin 


diag 


extracellular






Hs.81134 


interleukin 1 receptor antagonist


diag 


extracellular 






Hs.81892 


KIAA0101 gene product 




mitochondrial


417389 


BE260964 


Hs.82045 


midkine (neurite growth-promoting factor 


mAb & diag






BE270266 


Hs.82128 


5T4 oncofetal trophoblast glycoprotein 


mAb 


plasma membrane 






Hs.82952


thymidylate synthetase 




endoplasmic reticulum






Hs.1174 


cyclin-dependent kinase inhibitor 2A (me 




cytoplasm 


418506 


AA084248


Hs.85339 


G protein-coupled receptor 39 


mAb & s.m. 


plasma membrane 


418678 


NM_001327 


Hs.157379


cancer/testis antigen (NY-ESO-1)


CTL 


cytoplasmic 


419121 


AA374372 


Hs.89526 


parathyroid hormone-like hormone 


diag 


secreted 


419171 


NM_002846


Hs.89655 


protein tyrosine phosphatase, receptor t 


mAb & s.m. 


plasma membrane 


419183 


U60669 


Hs.89663 


cytochrome P450, subfamily XXIV (vitamin 


CTL & s.m. 


mitochondrial 


419216 


AU076718 


Hs.164021 
Hs.288433 


small inducible cytokine subfamily B(Cy 
neurotrimin 


diag 

mAb & diag 


plasma membrane 


419452 


U33635 


Hs.90572 


PTK7 protein tyrosine kinase 7 


mAb & s.m. 




419556 


U29615 


Hs.91093 


chitinase 1 (chitotriosidase)


mAb & diag 


extracellular


420610 


AI683183 


Hs.99348 


distal-less homeo box 5 


CTL 




421110 


AJ250717 


Hs.1355


cathepsin E 




extracellular 




Y15221 


Hs.103982 


small inducible cytokine subfamily B (Cy 


diag 






U76362 


Hs.104637


solute carrier family 1 (glutamate trans 




plasma membrane 


421552 


AF026692 


Hs.105700


secreted frizzled-related protein 4 


diag 




421753 


BE314828 


Hs.107911


ATP-binding cassette, sub-family B (MDR


mAb & s.m. 


plasma membrane 


421817 


AF146074 


Hs.108660 


ATP-binding cassette, sub-family C (CFTR 




plasma membrane 


422109 


S73265 


Hs.1473


gastrin-releasing peptide 


diag 




422158 


L10343 


Hs.112341 


protease inhibitor 3, skin-derived (SKAL 




secreted 


422282 


AF019225 


Hs.114309


apolipoprotein L 


diag 


secreted 


422283 


AW411307 


Hs.114311 


CDC45 (cell division cycle 45, S.cerevis 






422424 


AI186431 


Hs.295638 


prostate differentiation factor 




extracellular 


422765 


AW409701 


Hs.1578 


baculoviral IAP repeat-containing 5 (sur 




cytoplasm 


422809 


AK001379 


Hs.121028 


hypothetical protein FLJ10549






422867 


L32137 


Hs.1584


cartilage oligomeric matrix protein (pse 


diag




422956 


BE545072 


Hs.122579


ECT2 protein (Epithelial cell transformi






423634 


AW959908 


Hs.1690 


heparin-binding growth factor binding pr 
matrix metalloproteinase 12 (macrophage


diag 




423673 


BE003054 




mAb & diag & s.m. 




423961 


D13666 


Hs.136348


periostin (OSF-2os) 


mAb & diag 


extracellular 


424046 


AF027866 


Hs.138202


serine (or cysteine) proteinase inhibito 






424381 


AA285249 


Hs.146329


protein kinase Chk2 


s.m.
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424502 AF242388 Hs.149585 

424503 NM_002205 
424687 J05070 

425247 NM_005940 Hs.155324

425322 U63630 Hs.155637

425650 NM_001944 Hs.1925

425734 AF056209 Hs.159396

425776 U25128 Hs.159499

425852 AK001504 Hs.159651

426215 AW963419 Hs.155223

M86699 Hs.169840

BE616633 Hs.170195

AA448542 Hs.251677
428450 NM_014791

428479 Y00272 

428484 AF104032 

428664 AK001666 

428698 AA852773 

428748 AW593206 

428758 AA433988 

428969 AF120274 

429211 AF052693 

429263 AA019004 

429547 AW009166 

429610 AB024937 

429903 AL134197 

430486 BE062109 

431462 AW583672 

431515 NM_012152

431846 BE019924 

431958 X63629 

432201 AI538613 

433001 AF217513 

435505 AF200492 

436481 AA379597 

437016 AU076916 

437044 AL035864 

437789 AI581344 



Hs.2250 
Hs.2256 
Hs.184339
Hs.334562
Hs.184601
Hs.189095 
Hs.334838 
Hs.98785 



440006 AK000517 
441362 BE614410 
442117 AW664964 



Hs.5199 
Hs.5398 
Hs.69517 
Hs.127812
Hs.256897 
Hs.250618 
Hs.58042 
Hs.58561 
Hs.9598 
Hs.6844 
Hs.23044 



444371 BE540274 
444381 BE387335 
444781 NM_014400



Hs.288467 
Hs.25740 
Hs.326444 
Hs.28792 
Hs.29352 
Hs.61460 
Hs.30743 
Hs.62711 
Hs.127179 



protein kinase, DNA-activated, catalytic 
desmoglein 3 (pemphigus vulgaris antigen 
peptidylglycine alpha-amidating monooxyg 
parathyroid hormone receptor 2 
death receptor 6, TNF superfamily member 
stanniocalcin 2 
TTK protein kinase 

bone morphogenetic protein 7 (osteogenic 



leukemia inhibitory factor (cholinergic 



KIAA0175 gene product
cell division cycle 2, G1 to S and G2 to
solute carrier family 7 (cationic amino
similar to SALL1 (sal (Drosophila)-like
KIAA1866 protein
Ksp37 protein 
CA125 antigen; mucin 16 



LUNX protein; PLUNC (palate lung and nas
cyclin-dependent kinase 5, regulatory su
chloride channel, calcium activated, fam
granin-like neuroendocrine peptide precu
endothelial differentiation, lysophospha

cadherin 3, type 1, P-cadherin (placenta 
Transmembrane protease, serine 3 
clone HQ0310PRO0310p1 
interleukin-1 homolog 1
HSPC150 protein similar to ubiquitin-con
guanine monphosphate synthetase
differentially expressed in Fanconi's an

ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to dJ365012.1 [H.sa 
UL16 binding protein 2 

ESTs, Moderately similar to GFR3_HUMAN G 
G protein-coupled receptor 87 
sema domain, immunoglobulin domain (Ig),
NALP2 protein; PYRIN-Containing APAF
RAD51 (S. cerevisiae) homolog (E coli Re
ESTs; hypothetical protein for IMAGE:447
c-Myc target JP01 



mAb&diag&s.m. extracellular 



mAb&diag&s.m. plasma membrane 



Hs.1 0086 type I 



Hs.157601 
Hs.19322 
Hs.52620 



446921 AB012113 

447033 AI357412 

447342 AI199268 

448243 AW369771 

448844 AI581519 

449048 Z45051 

449722 BE280074 

450001 NM_001044



452281 T93500 

452401 NM_007115

452747 BE153855 

452838 U65011 

453968 AA847843 

457489 AI693815 



TABLE 14B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



ike-domain, multiple 6 
secreted phosphoprotein 1 (osteopontin, 
small inducible cytokine subfamily A (Cy 



similar to S68401 (cattle) glucose induc
cyclin B1

solute carrier family 6 (ne
a disintegrin and me
hypothetical protein XP_098151 (leucine-
ERO1 (S. cerevisiae)-like
cartilage acidic protein 1
Homo sapiens cDNA FLJ11041 fis, clone PL
tumor necrosis factor, alpha-induced pro



LN1R
AA926960 AA926959 W76521 W24270 W21525 AA037172 BE267636 H83136 AA469909 N86396 AA001348 BE535736 AA031745 BE566245
AA082436 H72525 H77575 N49786 W80565 H78746 BE569035 W04339 R98127 T55938 BE279271 AW960304 T29812 AA476873 BE297387
AA292753 AA177048 NM_001826 X54941 BE314366 AA908783 AI719075 BE270172 BE269819 AA889955 AI204630 W25243 AI935150
AA872039 W72395 T99630 AI422691 H98460 N31428 BE255915 H03265 AI857576 AA776920 AA910644 AA459522 AA293140 AW514667
R75953 AW662396 AA652522 AI865147 AI423153 AW252230 AA584410 AA583187 AW024595 AW069734 AI828996 AA282997 AA876046
AW613002 AA527373 AW972459 AI831360 AA621337 AA100926 AA772418 AA594628 AI033892 W95096 AI034317 AA398727 AI085031
N95210 AI459432 AI041437 AA932124 AA627684 AA935329 AI004827 AI423513 AI094597 H42079 R54703 AI630359 AA517681 AA978045
AA643280 W44561 A5991988 AI537692 AI090262 AA740817 AI312104 AI911822 AA416871 AI185409 AA129784 AA701623 AI075239
AI139549 AA633648 AI339996 AI336880 AA399239 AI078708 AI085351 AI352835 AI346618 AI146955 AI9893B0 AI348243 N92B92 AA765850
AI494230 AI278887 AA962596 AI492600 W80435 AA001979 R97424 AI129015 N24127 AA157451 AA235549 AA459292 AA037114 AA129785
AI494211 AW059601 AW886710 R92790 N59755 AI351128 AW589407 H47725 H97534 H48076 H48450 T99631 AW300758 H03431 R75789
AA954344 H77576 R96823 AI457100 N92845 N49682 H42033 BE220698 BE220715 H99552 AA701624 N74173 R64704 H79520 H72923
H03266 BE261919 AA769633 AA480310 AA507454 AA910586 AI203723 AW104725 W25611 W25071 T88980 H03513 T77589 R99156
W95095 R97470 AA702275 T77551 AA911952 H82956 NB3673 AA283672

AA009647 AA131254 AA374293 AW954405 H04410 AW6062B4 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532
AA190993 H03231 H59605 H01642 AA852375 AA113758 AA626915 AA746952 AI161014 AA099554 R69067



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124151,124455-124610,125672-126076 
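
As an informal illustration of how the nucleotide position column can be read, the following minimal sketch parses the comma-separated exon ranges of the row above and computes the length of each predicted exon. The function name `exon_lengths` is hypothetical, and the treatment of the coordinates as inclusive is an assumption; the tables do not state the coordinate convention.

```python
def exon_lengths(nt_position):
    """Return the length of each exon in a 'start-end,start-end' string.

    Assumes inclusive start and end coordinates (an assumption, not
    something the source tables state).
    """
    lengths = []
    for span in nt_position.split(","):
        start, end = (int(x) for x in span.strip().split("-"))
        lengths.append(end - start + 1)
    return lengths


# Exon ranges copied from the Pkey 402075 row above.
row = "121907-122035,122804-122921,124019-124151,124455-124610,125672-126076"
print(exon_lengths(row))       # per-exon lengths
print(sum(exon_lengths(row)))  # total predicted exonic bases
```

Under the inclusive-coordinate assumption, the five ranges in this row describe five predicted exons on the plus strand of the genomic sequence identified in the Ref column.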
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TABLE 15A: Information for all sequences in Table 16 

Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 16.
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 15C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 15A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Seq ID No: Sequence ID number 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title



Seq ID No: 

Seq ID No: 1 & 2
Seq ID No: 3 & 4
Seq ID No: 5 & 6
Seq ID No: 7 & 8
Seq ID No: 9 & 10
Seq ID No: 11 & 12
Seq ID No: 13 & 14
Seq ID No: 15 & 16
Seq ID No: 17 & 18
Seq ID No: 19 & 20
Seq ID No: 21 & 22
Seq ID No: 23 & 24
Seq ID No: 25 & 26
Seq ID No: 27 & 28
Seq ID No: 29 & 30
Seq ID No: 31 & 32
Seq ID No: 33 & 34
Seq ID No: 35 & 36
Seq ID No: 37 & 38
Seq ID No: 39 & 40
Seq ID No: 41 & 42
Seq ID No: 43 & 44
Seq ID No: 45 & 46
Seq ID No: 47 & 48
Seq ID No: 49
Seq ID No: 50 & 51

Seq ID No: 52 & 53

Seq ID No: 54 & 55
Seq ID No: 56 & 57
Seq ID No: 58 & 59



410407 
412719 
417034 
430486 
407788 
407788 
407788 
407788 



eqIDI 



452838 
418663 
418663 



Seq ID No: 6 
Seq ID No: 62 & 63
Seq ID No: 64 & 65
Seq ID No: 66 & 67
Seq ID No: 68 & 69
Seq ID No: 70 & 71
Seq ID No: 72 & 73
Seq ID No: 74 & 75
Seq ID No: 76 & 77
Seq ID No: 78 & 79
Seq ID No: 80 & 81
Seq ID No: 82 & 83
Seq ID No: 84 & 85
Seq ID No: 86 & 87
Seq ID No: 88 & 89
Seq ID No: 90 & 91
Seq ID No: 92 & 93
Seq ID No: 94 & 95
Seq ID No: 96 & 97
Seq ID No: 98 & 99
Seq ID No: 100 & 101
Seq ID No: 102 & 103
Seq ID No: 104 & 105
Seq ID No: 106 & 107
Seq ID No: 108 & 109
Seq ID No: 110 & 111
Seq ID No: 112 & 113
Seq ID No: 114 & 115
Seq ID No: 116
Seq ID No: 117 & 118
Seq ID No: 119 & 120
Seq ID No: 121 & 122
Seq ID No: 123 & 124
Seq ID No: 125 & 126



AW016610 
NM_006183
BE062109 



BE514982 

AL133916 

U17760 

AW368377 

NM_001944



412140 AA219691 



U65011 
AK001100 
AK001100 
W74001 



AI541214 
L10343 
AF200492 



406690 
431846 
418830 
424098 
443648 
311034 
408522 
422158 
435505 
417366 
431958 
441020 
423217 
429538 
448733 
444371 BE540274 
444371 BE540274 
444371 BE540274 
422168 AA586894 
422168 AA586894 
429259 AA420450 



429211 AF052693 

417389 BE260964 

423634 AW959908 

417515 L24203 

441362 BE614410 

425322 U63630 

449003 X76342 

431009 BE149762 

409103 AF251237 

417542 J04129 

428471 X57348 

418004 U37519 

414761 AU077228 

418203 X54942 

447343 AA256641 

437016 AU076916 

449230 BE613348 

446989 AK001898 

457319 AA057484 



Hs.63287 
Hs.816 
Hs.80962 
Hs.241551 



Hs.137569 

Hs.1925 

Hs.73625 

Hs.1695 

Hs.30743 

Hs.41690 

Hs.41690 

Hs.55279 

Hs.211092 

Hs.220529

Hs.271580

Hs.88959

Hs.139322

Hs.143610

Hs.311389

Hs.46320

Hs.112341

Hs.211238

Hs.1076 

Hs.2877 



Hs.239 
Hs.112408 
Hs.112408
Hs.292911 



Hs.130881
Hs.184601
Hs.198249
Hs.82045
Hs.1590
Hs.82237 
Hs.23044 
Hs.155537 



Hs.184510 
Hs.87539 
Hs.77256 



ESTs 

neurotensin 
chloride channel, cal 
S100 calcium-binding protein A2
S100 calcium-binding protein A2
S100 calcium-binding protein A2
S100 calcium-binding protein A2
hypothetical protein FLJ20093
laminin, beta 3 (nicein (125kD), kalinin
tumor protein 63 kDa with strong homolog
desmoglein 3 (pemphigus vulgaris antigen
RAB6 interacting, kinesin-like (rabkines



preferentially expressed antigen in mela 



ESTs, Highly similar to NKGD_HUMAN NKG2-
Small proline-rich protein SPRK [human, 

interleukin-1 homolog 1 

small proline-rich protein 1B (cornifin)



collagen, type VII, alpha 
small proline-rich protein 
solute carrier family 6 (neurotransmitte 
forkhead box M1
forkhead box M1
forkhead box M1

S100 calcium-binding protein A7 (psorias
S100 calcium-binding protein A7 (psorias
plakophilin

solute carrier family 2 (facilitated glu 
differentially expressed in Fanconi's an
B-cell CLL/lymphoma 11A (zinc finger pro

gap junction protein, beta 5 (connexin 3
midkine (neurite growth-promoting factor
heparin-binding growth factor binding pr 



RAD51 (S. cerevisiae) homolog (E coli Re
protein kinase, DNA-activated, catalytic
alcohol dehydrogenase 7 (class IV), mu o
gap junction protein, beta 6 (connexin 3 
XAGE-1 protein 

progestagen-associated endometrial prote 
stratifin

aldehyde dehydrogenase 3 family, member 

enhancer of zeste (Drosophila) homolog 2 

CDC28 protein kinase 2 

ESTs, Highly similar to S02392 alpha-2-m

guanine monphosphate synthetase 

melanoma cell adhesion molecule 

hypothetical protein FLJ1103B

ESTs, Highly similar to unnamed protein 

matrix metalloproteinase 9 (gelatinase B
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Seq ID No: 13 
Seq ID No: 133 & 134
Seq ID No: 135 & 136
Seq ID No: 137 & 138
Seq ID No: 139 & 140
Seq ID No: 141 & 142
Seq ID No: 143 & 144
Seq ID No: 145 & 146
Seq ID No: 147 & 148
Seq ID No: 149 & 150
Seq ID No: 151 & 152
Seq ID No: 153 & 154
Seq ID No: 155 & 156
Seq ID No: 157 & 158 453884
Seq ID No: 159 & 160 453884
Seq ID No: 161 & 162 404877
Seq ID No: 163 & 164 413129
Seq ID No: 165 & 166 413281
Seq ID No: 167 & 168 444781
Seq ID No: 169 & 170 416819
Seq ID No: 171 & 172 451320
Seq ID No: 173 & 174 418543
Seq ID No: 175 & 176
Seq ID No: 177 & 178
Seq ID No: 179 & 180
Seq ID No: 181 & 182
Seq ID No: 183 & 184
Seq ID No: 185 & 186
Seq ID No: 187 & 188
Seq ID No: 189 & 190
Seq ID No: 191 & 192
Seq ID No: 193 & 194
Seq ID No: 195 & 196
Seq ID No: 197 & 198
Seq ID No: 199 & 200
Seq ID No: 201 & 202
Seq ID No: 203 & 204
Seq ID No: 205 & 206 101175
Seq ID No: 207 & 208 429038
Seq ID No: 209 & 210 418678
Seq ID No: 211 & 212 418678
Seq ID No: 213 & 214 131927
Seq ID No: 215 & 216
Seq ID No: 217 & 218
Seq ID No: 219 & 220
Seq ID No: 221 & 222
Seq ID No: 223 & 224
Seq ID No: 225 & 226
Seq ID No: 227 & 228
Seq ID No: 229 & 230
Seq ID No: 231 & 232
Seq ID No: 233
Seq ID No: 234 & 235
Seq ID No: 236 & 237
Seq ID No: 238
Seq ID No: 239 & 240
Seq ID No: 241 & 242
Seq ID No: 243 & 244
Seq ID No: 245
Seq ID No: 246 & 247
Seq ID No: 248 & 249
Seq ID No: 250 & 251
Seq ID No: 252 & 253
Seq ID No: 254 & 255
Seq ID No: 256 & 257
Seq ID No: 258 & 259
Seq ID No: 260 & 261
Seq ID No: 262 & 263
Seq ID No: 264 & 265
Seq ID No: 266 & 267
Seq ID No: 268 & 269
Seq ID No: 270 & 271
Seq ID No: 272 & 273
Seq ID No: 274 & 275
Seq ID No: 276 & 277
Seq ID No: 278 & 279
Seq ID No: 280 & 281
Seq ID No: 282
Seq ID No: 283 & 284
Seq ID No: 285 & 286
Seq ID No: 287 & 288
Seq ID No: 289 & 290
Seq ID No: 291 & 292




ubiquitin carboxyl-terminal es
integrin, beta 4
CD44 antigen (homing fun
RAN binding protein 1
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
hypothetical protein FLJ10540
baculoviral IAP repeat-containing 5 (sur
HSPC150 protein sim
a disintegrin and met
G protein-coupled receptor 87
KIAA0186 gene product
KIAA0186 gene product
KIAA0186 gene product
KIAA0186 gene product
NM_005365:Homo sapiens melanoma antigen
RP42 homolog
transcription factor BMAL2
GPI-anchored metastasis-associated prote
pim-2 oncogene 

diacylglycerol kinase, zeta (104kD)



dehydrogenase 3 family, men 
topoisomerase (DNA) II alpha (170kD) 
tyrosine phosphatase, receptor-
tyrosine phosphatase, 
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receptor-t 
receptor-t 
receptor-t 



KIAA0144 gene product
ATP-binding cassette, sub-family C (CFTR
estrogen-responsive B box protein
achaete-scute complex (Drosophila) homol



calcitonin-related polypeptide, beta
calcitonin/calcitonin-related polypeptid

seizure related gene 5 (mouse)-like
cancer/testis antigen (NY-ESO-1)
cancer/testis antigen (NY-ESO-1)
oublecort se i , n in
ESTs, Weakly similar to GGC1_HUMAN G ANT
G antigen 7B

laminin, gamma 2 (nicein (100kD), kalini
ATPase, aminophospholipid transporter-li 
Human DNA sequence from clone RP5-850E9 
NM_021048:Homo sapiens melanoma antigen, 
serine (or cysteine) proteinase inhibito 
lysosomal 

Homo sapiens mRNA; cDNA DKFZp547C136 (fr 
Horn p r, NA FLJ1 1 lis I e NT 



guanine nucleotide binding protein (G pr 
ESTs 

cell division cycle 2, G1 to S and G2 to 
cell division cycle 2, G1 to S and G2 to

Homo sapiens clone N11 NTera2D1 teratoca
ESTs 

wingless-type MMTV integration site (ami 
DESC1 protein 

CDC45 (cell division cycle 45, S.cerevis
RAB38, member RAS oncogene family
Rh type C glycoprotein
MAD2 (mitotic arrest deficient, yeast, h
budding uninhibited by benzimidazoles 1
serine (or cysteine) proteinase inhibito 
UL16 binding protein 2 



cell division cycle 2-like 1 (PITSLRE pr 



gb:ye53h05.s1 Soares fetal liver spleen
hypothetical protein AF301222
hypothetical protein XP_098151 (leucine-
NM_002362:Homo sapiens melanoma antigen,
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Seq ID No: 293 & 294 42462S 
SeqIDNo 



310&311 
312 & 313 
314 & 315
316 & 317
318 & 319 
320 & 321 



SeqIDNo: 
SeqIDNo: 
Seq ID No: 
Seq ID No: 
SeqIDNo: 
SeqIDNo: 
SeqIDNo: 
SeqIDNo: 
SeqIDNo: 
Seq ID No: 
SeqIDNo: 
SeqIDNo: 
SeqIDNo: 
SeqIDNo: 
Seq ID No: 324 & 325
Seq ID No: 326 & 327
Seq ID No: 328 & 329
Seq ID No: 330 & 331
Seq ID No: 332 & 333
Seq ID No: 334 & 335
Seq ID No: 336 & 337
Seq ID No: 338 & 339
Seq ID No: 340 & 341
Seq ID No: 342 & 343
Seq ID No: 344 & 345
Seq ID No: 346 & 347
Seq ID No: 348 & 349
Seq ID No: 350 & 351
Seq ID No: 352 & 353
Seq ID No: 354 & 355
Seq ID No: 356 & 357
Seq ID No: 358 & 359
Seq ID No: 360 & 361
Seq ID No: 362 & 363
Seq ID No: 364 & 365



299 & 300 437789 



453968 AA847843 
403478 

441525 AW241867 

434105 AW952124 

428810 AF068236 

413691 AB023173 

423934 U89995 

409228 R16811 

425734 AF056209 

413582 AW295647 

438403 AA806607 
403329 

409893 AW247090 

119073 BE245360 

113195 H83265 
102283 



102012 
105729 
134299 



330493 
417866 
418113 
437016 

Seq ID No: 366 & 367 429612
Seq ID No: 368 & 369 440704
Seq ID No: 370 & 371 431221
Seq ID No: 372 & 373 431565
Seq ID No: 374 & 375 431565
Seq ID No: 376 & 377 132354
Seq ID No: 378 & 379 424441
Seq ID No: 380 & 381 103768
Seq ID No: 382 & 383 417512
Seq ID No: 384 & 385
Seq ID No: 386 & 387
Seq ID No: 388 & 389
Seq ID No: 390 & 391
Seq ID No: 392 & 393
Seq ID No: 394 & 395
Seq ID No: 396 & 397
Seq ID No: 398 & 399
Seq ID No: 400 & 401
Seq ID No: 402 & 403
Seq ID No: 404 & 405
Seq ID No: 406 & 407
Seq ID No: 408 & 409
Seq ID No: 410 & 411
Seq ID No: 412 & 413
Seq ID No: 414 & 415
Seq ID No: 416 & 417
Seq ID No: 418 & 419
Seq ID No: 420 & 421
Seq ID No: 422 & 423
Seq ID No: 424 & 425
Seq ID No: 426 & 427
Seq ID No: 428 & 429
Seq ID No: 430 & 431
Seq ID No: 432 & 433
Seq ID No: 434 & 435
Seq ID No: 436 & 437
Seq ID No: 438 & 439
Seq ID No: 440 & 441
Seq ID No: 442 & 443
Seq ID No: 444 & 445
Seq ID No: 446 & 447
Seq ID No: 448 & 449
Seq ID No: 450 & 451
Seq ID No: 452 & 453
Seq ID No: 454 & 455
Seq ID No: 456 & 457
Seq ID No: 458 & 459
Seq ID No: 460 & 461



NM_005795 

U84722 

BE259035 

H46612 

AW580939 

AW016610 

L10343 

BE279383 

T19006 



409459 D86407 



Hs.151393 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.62711 

Hs.127728 



Hs.75478 
Hs.159234 
Hs.22010 
Hs.159396 
Hs.71331
Hs.292206 

Hs.57101 
Hs.279477 
Hs.8881 
Hs.83381 



424503 
400289 
418007 
418007 
418738 
415138 
418506 
423961 
414812 
417433 
417433 
422867 
428227 
444381 
400303 



421552 
452747 
450375 
426215 
425247 
432201 
427585 
442117 
431211 
447033 
447033 
447033 
115522 
410418 

409041 
452461 
412420 
416658 
407811 



AW067903 

AI272141 

AU076916 

AF062649 

M69241 

AA449015

AF161470 

AF161470 



M13509 

M13509 

AW388633 

C18356 

AA084248 

D13666 

X72755 

BE270266 

BE270266 

L32137 

AA321549 

BE387335 

AA242758 

AF245505 

AA852773

W27249 



AI357412 
AI357412 
AI357412 



Hs.82772 
Hs.83484 
Hs.5398 
Hs.252587 



Hs.1076 

Hs.147097 

Hs.296398 



Hs.136348 

Hs.77367 

Hs.82128 

Hs.82128 

Hs.1584 

Hs.2248 

Hs.283713 

Hs.79136

Hs.72157 

Hs.334838 

Hs.8109 

Hs.105700 

Hs.61460 



Hs.155324
Hs.298241
Hs.179729
Hs.128899
Hs.323733
Hs.157601
Hs.157601
Hs.157601
Hs.333893 
Hs.63325 
Hs.50081 
Hs.50081 
Hs.108106 
Hs.73853 
Hs.79432 
Hs.40098 



glutamate-cysteine ligase, catalytic sub
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
High mobility group (nonhistone chromoso
NM_022342:Homo sapiens kinesin protein 9
ESTs 



ATPase, Class VI, type 11B 
forkhead box E1 (thyroid transcription f
ESTs, Weakly similar to 2109260A B cell 
peptidylglycine alpha-amidating monooxyg 
hypothetical protein M6CS360 
ESTs 

unnamed protein product [Homo sapiens] 
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cadherin 5, type 2, VE-cadherin (vascula 
singed (Drosophila)-like (sea urchin fas
Homo sapiens HSPC285 mRNA, partial cds 
complement component C1q receptor 
ESTs 

protease inhibitor 3, skin-derived (SKAL 
plakophilin 3 

RAN, member RAS oncogene family 
parathyroid hormone-like hormone 
low density lipoprotein receptor-related 
endogenous retroviral protease 
collagen, type XI, alpha 1 
SRY (sex determining region Y)-box 4 
guanine monphosphate synthetase



insulin-like growth factor binding prote 
SRB7 (suppressor of RNA polymerase B, yf 
butyrate-induced transcript 1 
butyrate-induced transcript 1 
small proline-rich protein 1B (cornifin)
H2A histone family, member X
gb:Homo sapiens full length insert cDNA 
glycoprotein (transmembrane) nmb 
alpha-fetoprotein 

integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 10 (stromelysin
matrix metalloproteinase 1 (interstitial
matrix metalloproteinase 1 (interstitial
solute carrier family 7, (cationic amino
tissue factor pathway inhibitor 2 
G protein-coupled receptor 39 
periostin (OSF-2os) 
monokine induced by gamma interferon 
5T4 oncofetal trophoblast glycoprotein
5T4 oncofetal trophoblast glycoprotein 



small inducible cytokine subfamily B (Cy 
ESTs, Weakly similar to S64054 hypotheti 
LIV-1 protein, estrogen regulated 

KIAA1866 protein 
hypothetical protein FLJ21080 
secreted frizzled-related protein 4



Transmembrane protease, serine 3 
collagen, type X, alpha 1 (Schmid metaph 
ESTs; hypothetical protein for IMAGE:447
gap junction protein, beta 2, 26kD (conn 



bone morphogenetic protein 7



cysteine knot superfamily 1, BMP antagon






WO 02/086443 



PCT/US02/12476 




cartilage acidic protein 1 



death receptor 6, TNF superfamily member



Predicted cation efflux pump 

C15000 ' 5 gi|3806122 gb|AAC69198.1| (AFO 

C150003C5:gi|3806122!gb|AAC69198.1|(AF0 



small inducible cytokine subfamily B (Cy 

interleukin

glypican 1
aquaporin 4
KIAA0B77 protein

chitinase 3-like 1 (cartilage glycoprote
lung type-I cell membrane-associated gly
tumor necrosis factor, alpha-induced pro
tumor necrosis factor, alpha-induced pro 
solute carrier family 6 (neurotransmitte 
carbonic anhydrase IX 

gb:hd13d01.x1 Scares JIFL_T_GBC_S1 Homi 
ESTs 

chloride channel, calcium activated, fam 
laminin, beta 3 (nicein (125kD), kalinin 
desmoglein 3 (pemphigus vulgaris antigen
matrix metalloproteinase 12 (macrophage
desmocollin 3 

LUNX protein; PLUNC (palate lung and nas 
carcinoembryonic antigen-related cell ad 
uroplakin 1B

protease inhibitor 3, skin-derived (SKAL
cadherin 3, type 1, P-cadherin (placenta
differentially expressed in Fanconi's an
solute carrier family 7 (cationic amino
gap junction protein, beta 5 (connexin 3 



gap junction protein, beta 6 (connexin 3 
progestagen-associated endometrial prote 
melanoma cell adhesion molecule
a disintegrin and metalloproteinase doma
a disintegrin and metalloproteinase doma
matrix metalloproteinase 9 (gelatinase B
integrin, beta 4
hypoxia-inducible protein 2
G protein-coupled receptor 87
NM_005365:Homo sapiens melanoma antigen
asis-associated prote 



hyalur. 



protein tyrosine phosphatase, re
protein tyrosine phosphatase, re
i I rrr'k*"' se
protein tyrosine phosphatase, re
protein tyrosine phosphatase, re
protein tyrosine phosphatase, re
ATP-binding cassette, sub-family C (CFTR
cancer/testis antigen (NY-ESO-1)
cancer/testis antigen (NY-ESO-1)
laminin, gamma 2 (nicein (100kD), kalini 

neurotrophic tyrosine kinase, receptor,
neurotrophic tyrosine kinase, receptor, 
UL16 binding protein 2 
cystatin SN 
artemin 



hypothetical protein XP_098151 (le 
hypothetical protein XP_098151 (le
plasminogen activator, urokinase



gb:Human nonspecific crossreacting antig
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Hs.2442 
Hs.1473 
Hs.288433 



similar to S68401 (cattle) glucose induc
small inducible cytokine subfamily B (Cy
granin-like neuroendocrine peptide precu
integrin, beta 8




Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

CAT Number Accession 
AW341683 

33264 5 M27826 R78416 AA307645 AW957879 AW957800 AA633529 H03662

47065 1 AL133916 N79113 AF086101 N76721 AW950823 AA364013 AW955684 AI346341 AI867454 N54734 AI655270 AI421279 AW0148B2

AA775552 N62351 N59253 AA626243 AI341407 BE175639 AA45B958 AI358918 AA457077
83327 1 AA009647 AA131254 AA374293 AW954405 H04410 AW6062B4 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532

AA190993 H03231 H59605 H01642 AA852875 AA113758 AA626915 AA745952 AI161014 AA099554 R69067
86576 1 AW118072 AI631982 T15734 AA224195 AI701458 W20198 F26326 AA890570 N90552 AW071907 AI671352 AI375892 T03517 R88265

AI124088 AA224388 AI084316 AI3546B6 T33652 AI140719 AI720211 T03490 AI372337 T15415 AW2D583S AA6303B4 T03515 T33230

AA017131 AA443303 T33623 AI222556 T33511 T33785 AI419606 D55612



TABLE 16C 

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
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1 11 21 31 41 51 

I I I I I I 

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60

AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120 

CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC 240 

AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300

GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360

T TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420 

3GGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGCGAC 480 

CCGCCCTGGC CCCGGGTGTC CCCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540

CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600 

CTCCCGCCGC TCCCAGAACT GCGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720

CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780 

CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840

GGGCGCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020

TATGAGGGGT CTCTGACTAC ACCGCCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCGAGTGATT 1200

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA ACTGTCCTGT CCTGCTCATT 1500 

ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT 

Seq ID NO: 2 Protein sequence: 

Protein Accession #: NP_001207

1 11 21 31 41 51 

I I I 1 I I 

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60

GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120

DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180

ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 240

VEGHRFPAEI HVVHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300

EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 360

DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420

GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA
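
The correspondence between the nucleotide listing of Seq ID NO: 1 and the protein listing of Seq ID NO: 2 can be checked informally. The following is a minimal sketch, assuming the standard genetic code and assuming that the coding region begins at the ATG at nucleotide 43 of Seq ID NO: 1, with the ambiguous character at position 44 read as T; neither assumption is stated in the listing itself. The helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 43-72 of Seq ID NO: 1 as read from the listing (assumed start of the CDS).
cds_start = "ATGGCTCCCCTGTGCCCCAGCCCCTGGCTC"
print(translate(cds_start))
```

The printed residues can be compared against the first ten residues of Seq ID NO: 2; the same comparison, extended along the listing, is one way to spot transcription errors in either sequence.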



AGCGGGGTTG TCTATTAACT TGTTCAAAAA GTATCAGGAG TTGTCAAGGC AGAGAAGAGA 
GTGTTTGCAA AAGGGGGAAA GTAGTTTGCT GCCTCTTTAA GACTAGGACT GAGAGAAAGA 
AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAGCCCCAG GCTTAAGCCT T~ 
TAATAATAAC AATCATCGGC GGCGGCAGGA TCGGCCAGAG GAGGAGG 
TGATCCTGAT TCCAGTTTGC CTCTCTCTTT TTTTCCCCCA AATTATTCTT C 

TCCTCGCGGA GCCCTGCGCT CCCGACACCC CCGCCCGCCT CCCCTCCTCC TCTCCCCCCG 360 

CCCGCGGGCC CCCCAAAGTC CCGGCCGGGC CGAGGGTCGG CGGCCGCCGG CGGGCCGGGC 420 

CCGCGCACAG CGCCCGCATG TACAACATGA TGGAGACGGA GCTGAAGCCG CCGGGCCCGC 480

AGCAAACTTC GGGGGGCGGC GGCGGCAACT CCACCGCGGC GGCGGCCGGC GGCAACCAGA 540 

AAAACAGCCC GGACCGCGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCCGCGGGC 600 

AGCGGCGCAA GATGGCCCAG GAGAACCCCA AGATGCACAA CTCGGAGATC AGCAAGCGCC 660 

TGGGCGCCGA GTGGAAACTT TTGTCGGAGA CGGAGAAGCG GCCGTTCATC GACGAGGCTA 720

AGCGGCTGCG AGCGCTGCAC ATGAAGGAGC ACCCGGATTA TAAATACCGG CCCCGGCGGA 780

AAACCAAGAC GCTCATGAAG AAGGATAAGT ACACGCTGCC CGGCGGGCTG CTGGCCCCCG 840 

GCGGCAATAG CATGGCGAGC GGGGTCGGGG TGGGCGCCGG CCTGGGCGCG GGCGTGAACC 900

AGCGCATGGA CAGTTACGCG CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGGACCAGCT GGGCTACCCG CAGCACCCGG GCCTCAATGC GCACGGCGCA GCGCAGATGC 1020 

AGCCCATGCA CCGCTACGAC GTGAGCGCCC TGCAGTACAA CTCCATGACC AGCTCGCAGA 1080

CCTACATGAA CGGCTCGCCC ACCTACAGCA TGTCCTACTC GCAGCAGGGC ACCCCTGGCA 1140 

TGGCTCTTGG CTCCATGGGT TCGGTGGTCA AGTCCGAGGC CAGCTCCAGC CCCCCTGTGG 1200

TTACCTCTTC CTCCCACTCC AGGGCGCCCT GCCAGGCCGG GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGCGCC GAGGTGCCGG AACCCGCCGC CCCCAGCAGA CTTCACATGT 1320 

CCCAGCACTA CCAGAGCGGC CCGGTGCCCG GCACGGCCAT TAACGGCACA CTGCCCCTCT 1380 

CACACATGTG AGGGCCGGAC AGCGAACTGG AGGGGGGAGA AATTTTCAAA GAAAAACGAG 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAGAGT AAGAAACAGC ATGGAGAAAA CCCGGTACGC 1500 

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCGCGATGC CGACAAGAAA ACTTTTATGA 1620

GAGAGATCCT GGACTTCTTT TKGGGGGACT ATTTTTGTAC AGAGAAAACC TGGGGAGGGT 1680 

GGGGAGGGCG GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGGAGCT TTGCAGGAAG TTTGCAAAAG TCTTTACCAA 1800

TAATATTTAG AGCTAGTCTC CAAGCGACGA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAGTA TTTATCGAGA TAAACATGGC AATCAAAATG TCCATTGTTT ATAAGCTGAG 1920
AATTTGCCAA TATTTTTCAA GGAGAGGCTT CTTGCTGAAT TTTGATTCTG CAGCTGAAAT 1980

TTAGGACAGT TGCAAACGTG AAAAGAAGAA AATTATTCAA ATTTGGACAT TTTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGG CGAACCATCT CTGTGGTCTT 2100 

GTTTAAAAAG GGCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

T GCAGGTTGAC ACCGTTGGTA ATTTATAATA GCTTTTGTTC GATCCCAACT 2220 



GTTTGTAATA TTTCTGTAAA TTTATTGTGA TATTTTAAGG TTTTCCCCCC TTTATTTTCC 
GTAGTTGTAT TTTAAAAGAT TCGGCTCTGT ATTATTTGAA TCAGTCTGCC GAGAATCCAT 
GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AGTTTTTACT 
CCATTATGCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACT 
AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AACAAAAAAA A 
CACAACACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 
CCACAACACA AACAACAACA CACAGAGGG 

Seq ID NO: 4 Protein sequence: 
Protein Accession #:CAA83435.1 



KKDKYTLPGG LLAPGGNSMA SGVGVGAGLG AGVNQRMDSY AHMNGWSNGS YSMMQDQLGY
PQHPGLNAHG AAQMQPMHRY DVSALQYNSM TSSQTYMNGS PTYSMSYSQQ GTPGMALGSM
GSVVKSEASS SPPVVTSSSH SRAPCQAGDL RDMISMYLPG AEVPEPAAPS RLHMSQHYQS
GPVPGTAING TLPLSHM 

Seq ID NO: 5 DNA sequence 
Nucleic Acid Accession #: U91618
Coding sequence: 29-541 

1 11 21 31 41 51

| | | | | |

CGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTTGTATG 
CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 
AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 
TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCCC 
AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 
TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 
TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 
TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA
GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 
AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 
ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 
ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT 



Seq ID NO: 6 Protein sequence: 
Protein Accession #: AAB50564

1 11 21 31 41 51 

| | | | | |

MMAGMKIQLV CMLLLAFSSW SLCSDSEEEM KALEADFLTN MHTSKISKAH VPSWKMTLLK 60 

VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA HLTIYQLHKI CHSRAFQHWE 120
LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY 
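
The "Coding sequence" coordinates given with each DNA entry are 1-based and inclusive, so the listed protein can be cross-checked against the nucleotide sequence by translating the indicated span. The following is a minimal sketch of such a check, assuming the standard genetic code; the function name `translate_cds` and the variable `SEQ5_START` are illustrative and do not appear in the listing. It uses only the first 120 nucleotides of Seq ID NO: 5 as printed above, so it covers only the first 30 codons of the 29-541 coding region.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(dna, start, end):
    """Translate positions start..end (1-based, inclusive) of a DNA string.

    Spaces between the ten-base groups are ignored. Codons containing
    ambiguity codes or OCR damage are reported as 'X'. Translation stops
    at the first stop codon.
    """
    seq = dna.replace(" ", "").upper()
    cds = seq[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First two printed lines (positions 1-120) of Seq ID NO: 5.
SEQ5_START = (
    "CGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTTGTATG "
    "CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA"
)

# Positions 29-118 hold the first 30 codons of the coding sequence.
print(translate_cds(SEQ5_START, 29, 118))
# -> MMAGMKIQLVCMLLLAFSSWSLCSDSEEEM
```

The output agrees with the first thirty residues of Seq ID NO: 6. Running the same check over a full entry is one way to locate positions where the printed DNA and protein sequences disagree.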

Seq ID NO: 7 DNA sequence 
Nucleic Acid Accession #: NM_006536.2 
Coding sequence: 109-2940

1 11 21 31 41 51 

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60

ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TACGGATCAC GAGGCCGAGT GTTTGTCCAT 600 

GAATGGGCCC ACCTCCGTTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720

GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGTTGAA 1140 

ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200

CACCAAATTA ACAGCAATGA TGATCGAAAG TTGCTGGTTT CATATCTGCC CACCACTGTA 1260

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440 
CTGGGTTCAT CTGCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAG 1500
TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 



AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACTGTGGG C 

ATGTTTCTAG TTACGTGGCA GGCCAGTGGT CCTCCTGAGA TTATATTATT TGATCCTGAT 1740 

GGACGAAAAT ACTACACAAA TAATTTTATC ACCAATCTAA CTTTTCGGAC AGCTAGTCTT 1800 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACCCATCAT 1860 

3 CCCTGAAAGT GACAGTGACC TCTCGCGCCT CCAACTCAGC TGTGCCCCCA 1920 

3 AAGCCTTTGT GGAAAGAGAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAATG TGAAACAGGG ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040 

GAGCCAGAGA CTGGAGATCC TGTTACGCTG AGACTCCTTG ATGATGGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATGGTAGA 2160 
TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA G 
CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA ACGGTAATAT T 

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGCGAA AGTGGGGCTT TAGCCGAGTC 2340

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460 

TGGACAGCAC CTGGAGAAGA CTTTGATCAG GGCCAGGCTA CAAGCTATGA AATAAGAATG 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTTTAGT AAATACATCA 2580 

AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA CGTTCTCACC CCAGATTTCC 2640 

ACGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATGTT 2700 

GCAATACGAG CAATGGATAG GAACTGCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGCG 2760

CCTCTGTTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820

GGAGTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC TTATTATAGT TGTGACACAT 2880

CATACTTTAA GCAGGAAAAA GAGAGCAGAC AAGAAAGAGA ATGGAACAAA ATTATTATAA 2940

ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC ATGGCCTTCG ACTACAAAAA 3000 

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTGAG TTTTTGTACA 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGGGTAG ATTAGAAAAC 3120

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180

GCAAAGGGAA GGGTAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GAGGTGGAAA 3240 

AATAGCCCCA AGCAGAGAAA AGGAGGGTAG GTCTGCATTA TAACTGTCTG TGTGAAGCAA 3300 

TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360

TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AGAGGTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540

TTTATGACAA AGGTCTATTG AATTTATTTG TNTGTAAGTT TCTACTCCCA TCAAAGCAGC 3600 

TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT TATAGCCCCM TATAATGCCT 3660 
TACCTAGGAA A 



MTQRSIAGPI CNLKFVTLLV ALSSELPFLG AGVQLQDNGY NGLLIAINPQ V

IKEMITEASF YLFNATKRRV FFRNIKILIP ATWKANNNSK I 
GDDPYTLQYR GCGKEGKYIH FTPNFLLNDN LTAGYGSRGR M
KPFYINGQNQ IKVTRCSSDI TGIFVCEKGP CPQENCIISK LFKEGCTFIY NSTQNATASI 
MFMQSLSSVV EFCNASTHNQ EAPNLQNQMC SLRSAWDVIT D

TFSLVQAGDK VVCLVLDVSS KMAEADRLLQ LQQAAEFYLM QIVEIHTFVG I.

RAQLHQINSN DDRKLLVSYL PTTVSAKTDI SICSGLKKGF EVVEKLNGKA YGSVMILVTS
GDDKLLGNCL PTVLSSGSTI HSIALGSSAA PNLEELSRLT GGLKFFVPDI SNSNSMIDAF
SRISSGTGDI FQQHIQLEST GENVKPHHQL KNTVTVDNTV GNDTMFLVTW QASGPPEIIL
FDPDGRKYYT NNFITNLTFR TASLWIPGTA KPGHWTYTLN NTHHSLQALK VTVTSRASNS 

AVPPATVEAF VERDSLHFPH PVMIYANVKQ GFYPILNATV TATVEPETGD PVTLRLLDDG
AGADVIKNDG IYSRYFFSFA ANGRYSLKVH VNHSPSISTP AHSIPGSHAM YVPGYTANGN
IQMKAPRKSV GRNEEERKWG FSRVSSGGSF SVLGVPAGPH PDVFPPCKII DLEAVKVEEE 
LTLSWTAPGE DFDQGQATSY EIRMSKSLQN IQDDFNNAIL VNTSKRNPQQ AGIREIFTFS 
PQISTNGPEH QPNGETHESH RIYVAIRAMD RNSLQSAVSN IAQAPLFIPP NSDPVPARDY 

LILKGVLTAM GLIGIICLII VVTHHTLSRK KRADKKENGT KLL

Seq ID NO: 9 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-632

1 11 21 31 41 51 

| | | | | |

CTCCCCTCAC CCCGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 
CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CCCTCCCCCA GCTGGTGGTG
GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC
CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG
CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT
GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG
CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA
AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGG 
AGAAAGTGGA TGAGGAGGGG CTGAAGAAGC TGATGGGCAG CCTGGATGAG AACAGTGACC 
AGCAGGTGGA CTTCCAGGAG TATGCTGTTT TCCTGGCACT CATCACTGTC ATGTGCAATG 
ACTTCTTCCA GGGCTGCCCA GACCGACCCT GAAGCAGAAC TCTTGACTTC C 
TCTCTTGGGC CCAGGACTGT TGATGCCTTT GAGTTTTGTA TTCAATAAAC T
TGTTGATAAT ATTTTAATTG CTCAGTGATG TTCCATAACC CGGCTGGCTC A 
CTGGGAGATG AGGGCCTCCT GGATCCTGCT CCCTTCTGGG CTCTGACTCT CCTGGAAATC 
TCTCCAAGGC CAGAGCTATG CTTTAGGTCT CAATTTTGGA ATTTCAAACA CCAGCAAAAA 
C GAGATAGGTT GCTGACTTTT ATTTTGTCAA ATAAAGATAT TAAAAAAGGC 



Seq ID NO: 10 Protein sequence:
Seq ID NO: 11 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-626



C CCCGGTCCAG GATGCCCAGT C 



3 CTCAGGGGCT GCCCTTGAC 



GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240 

CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300 

GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG 360

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420

AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGC 480

ATTCCAGAGA ACCATGTGCT GTGAGGGCCT TCCGAGTCCA TCTGTTTAAT CCTGTCATTG 540 

GAGACTTGAG AAACCAGAGC CCAGAAGGGA AAAGTGATTG TCCCAAGATC ACACAGCACT 600 

GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC AGCCTGGATG AGAACAGTGA 660 

CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATGTGCAA 720

TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA ACTCTTGACT TCCTGCCATG 780 

GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 840 

TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CCCGGCTGGC TCAGCTGGAG 900

TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG GGCTCTGACT CTCCTGGAAA 960 

TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG GAATTTCAAA CACCAGCAAA 1020 

AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC AAATAAAGAT ATTAAAAAAG 1080
GCAAATACCA 

Seq ID NO: 12 Protein sequence:



MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGHS REPCAVRAFR 
VHLPNPVIGD LRNQSPEGKS DCPKITQHWR KWMRRG 



1 11 21 31 41 51 

| | | | | |

GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TRCACAGCCA CAGATCCATG 
ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 
TGCCAAGAGG GCGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC 
AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 
CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 
ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 
CCCGGCTGGC TCAGCTGGAG TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG 
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG 
GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 
AAATAAAGAT ATTAAAAAAG GCAAATACCA 



Seq ID NO: 15 DNA sequence 
Coding sequence: 62-358 

1 11 21 31 41 51 

| | | | | |

GGAGGGTGTG CCGCTGAGTC ACTGCCTGGG CATCTGGGCC TGGAACCTCG GCCACAGATC 
CATGATGTGC AGTTCTCTGG AGCAGGCGCT GGCTGTGCTG GTCACTACCT TCCACAAGTA 
CTCCTGCCAA GAGGGCGACA AGTTCAAGCT GAGTAAGGGG GAAATGAAGG AACTTCTGCA 
CAAGGAGCTG CCCAGCTTTG TGGGGGAGAA AGTGGATGAG GAGGGGCTGA AGAAGCTGAT 
GGGCAGCCTG GATGAGAACA GTGACCAGCA GGTGGACTTC CAGGAGTATG CTGTTTTCCT 
GGCACTCATC ACTGTCATGT GCAATGACTT CTTCCAGGGC TGCCCAGACC GACCCTGAAG 
CAGAACTCTT GACTTCCTGC CATGGATCTC TTGGGCCCAG GACTGTTGAT GCCTTTGAGT 
TTTGTATTCA ATAAACTTTT TTTGTCTGTT GATAATATTT TAATTGCTCA GTGATGTTCC
ATAACCCGGC TGGCTCAGCT GGAGTGCTGG GAGATGAGGG CCTCCTGGAT CCTGCTCCCT
TCTGGGCTCT GACTCTCCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT 
TTTGGAATTT CAAACACCAG CAAAAAATTG GAAATCGAGA TAGGTTGCTG ACTTTTATTT 
SKGEMKELLH KELPSFVGEK VDEEGLKKLM



Nucleic Acid Accession #: Eos sequence



PCT/US02/12476 



AAGACGGATT C 
AGAATAGTTA C 
CCCGAGGCTC T 
CGCGGGCGCA G 
AACCAAGCAC G 
CTGTGGTGTG A 
AATTTTCTGG A 









2 CTAGTGAGAC 



GGCGGGCGTG 



CTGCGCCCCA 



CCTGCTCTGC 
TCCCGGAATT 
AGTCCCCGGC 
GCCCGGCCTC 
GAGAGTCCCG 



GTCAGCCCTC 
GGGTTGGAGC 
GGTAGCAGGA 



CTGTGAACCC 
GCTGGACTCG 
AGGCCTCCCC 
GCGGGGACAG GCACTCGGGC TGGCACTGGC 



3 CCTCTGATAA G 



CCGCTTTCGC 
CTTCTCCTGG 
TCACCGAAAT 
AAGCTTATGT 
ATAAAGCATT 
CGAGTTTGTC 
ATCCATTTAC 
GTCCAGACAC 
ACCTGCAGAT 
TGGAGGAAGG 
TGTATTGGGA 
GCTCCTTAAG 
CGGAAAATCT 
CTATCACATT 



TTTCATCGCA 
GGGACTGAGA 
TCTGAAAAAC 
TAGGAAACAT 
ATGCTCCTGT 
TCAGGATTTG 
ACCCAATTGT 
AAAGTCTATC 



TCCTGCAAAT 
TTTCCGAGAT 
AACCAGAAAA 
AATCTGACAA 
AGCAACCTGC 
TTCCGTCACC 
GACATTATGT 
TACTGCCTGA 
GGTTTGCCAT 
ACATTATCCT 



CCGAGCAGCG 
TGCCGCCTGC 
GCACGCCCGC 
GCACGGGTGG 
TGCTAGGGAT 
TCTGCTGGCT 
GCAGTGCCTC 



AGGAGCCTCG 
GCCTGGACCC 
CCGGACCCCC 
CGGTCGG7GC 



CGGAACACTC 



GGGAAAGCGG 
GTCGTCCTGG 
GGTTGTGGGC 



CCTGTAAAGC 
TTCGCTCCGG 
GAGGAGTTAA 
CCGGTGCAGC 



GGTTAGAAAT 
TTGTGGATTC 
AGCACATCAA 
TTGACTTGTC 
GGATCAAGAC 



AATACATCTG 



GATAACTAAC 
TGTAGGAGAA 
TCTCGAATCT 

CAAACCAGCG 



CTGCAAATCT 
GTAGTGTGGC 
AACATATGAA 



GGAAGGATGA 
CAAACCCAAA 
GGGACACCAC 
GGGAACATCT 
TGGTAATGCT 
TTTTGTTTCA 



CACTCACATG 



CTTCAGTGGT 
CATGTTACCA 
AACAATGGGG 



CTGTCAACCT 
ACCACCACTG 
TCTATAACGG 
ATCACACGGA 
ACTACACTCT 



CAGTGTAGAT 
CATCAACGAA 
TGGATTAAAA 
TTTTACCCGA 
TGAACTGATC 
TCTCCAAGAG 
CAAGAATATT 
GGCCGCACCT 
AGGTGATCCG 
TGAAACAAGC 
GAAGCAGATC
CACTGTGCAT 
GTGCATTCCA 
GGCAATATTG 
GTACCACGGC 
AATAGCCAAG 



TGCAGCGACC 
CCTGAGAACA 
GATGATGTTG 



AACAAACTGA 



GTTCCTAATA 
CACACACAGG 

TCTTGTGTGG 



TTCACTGTGA 



TTATCCTGAT 



I AATGAAATCC 
T AAGTTGGCAA 



AACTGCAGCG 
CGTCACTGAT 
TGTGGTGGGA 
GTTTGGCATG 



TGCCTCCAGC 
AATGAGTATG 
GACGATGGTG 
AATGACATCG 
AAAACCGGTC 



CTTATCCCGG 



GTAACTCTCA 
TTACAGTAGT 
AAAGTGTGCT 
TTGACCTGCA 
GTCTAATCTA
TTCAGAGGGT 
ACCATCACTT 




TTAGCCAGCA 



CATGTAACAC 
TTGACTTTTT 
TGGGACTTGG 
TTAGCTTAGG 
AAACAAAACA 



CCTGGGAGCA 
CCCCCTACAA 
CCAGCAGCAA 
TGCTGAGAGG 
ACACCTGTTT 
CAGGCAGTAT 
TGTTCCTTTT 
AATCAGCTCT 
TTTTAAAAAT 
TCTATAGATT 
CAAGGCATTA 
TCCATAATGA 
ATGAAGACCT 



ATTTCAGAAT 
TTATTTGCCC 
ATCTCATGCT 
GAATGGCTGG 
CATACTGTCA 
AGAGGTGGCA 
GCAGCCTTAG 
CATTCACTTT 



TCTGAGAGTC 
AAACAAAACA 
GCAGCAACAG 
AACTAAGAGT 
TCTGCGATCC 



CCTGCTGTGA 
TACTGCTGGG 
GGTCGCTAAT 



AACAAATGAA 
CTGTTTTGTT 
GGAATATATG 
ACCTGCTTTT 
CAGAACTGCA 



TTATCAGGAG 
GACAGTTAGA 
TTATTTTTTT 
TTTAACTAGT 
ATCTTAATAA 
ATATTTTATA 



GACTTCAGAG 
CATGCACACA 
TGGAAATAGT 
CCAACACAGT 
ACCAGGATCC 
CTGCATCCTT 
CCTATGGATT 



GAATATATGC 
TTCTGCATCC 
TGACCTTTGT 
TGGCTATCCC 



TAGAAGTCTG 
GCTTTTCTAC 
AGATTCTAAG 
TAGGAAAGCT 
TTTATAATGT 



TCTGAAAAGG 

TGTCCTGACC 
CCTTCTTCAT 
TGACCCATGG 



GACGCCATAG 
TGCACAAATG 
CAGAAACATT 
ATTTAGGTAC 
TACATTAGCC 
GCAGCATTTC 



CTGCAATTTA GCTTTAAGG? 
GTTTTGAATC CTCTGTAAAC 
CACTTGATAT AAAAAGGATA 
ACTAAATACG TTATTGCTTG 
ACTTGGCTAC TTCATACCCA 



3240 
3300 
3360 
3420 
3480
3540 
3600 
3660 
3720 
3780 

3900 
3960 
TGCCTTAAAG AGGGGCAGTT TCTCAAAAGC AGAAACATGC CGCCAGTTCT CAAGTTTTCC 4200

TCCTAACTCC ATTTGAATGT AAGGGCAGCT GGCCCCCAAT GTGGGGAGGT CCGAACATTT 4260 

TCTGAATTCC CATTTTCTTG TTCGCGGCTA AATGACAGTT TCTGTCATTA CTTAGATTCC 4320

GATCTTTCCC AAAGGTGTTG ATTTACAAAG AGGCCAGCTA ATAGCAGAAA TCATGACCCT 4380 

GAAAGAGAGA TGAAATTCAA GCTGTGAGCC AGGCAGGAGC TCAGTATGGC AAAGGTTCTT 4440 

GAGAATCAGC CATTTGGTAC AAAAAAGATT TTTAAAGCTT TTATGTTATA CCATGGAGCC 4500 

ATAGAAAGGC TATGGATTGT TTAAGAACTA TTTTAAAGTG TTCCAGACCC AAAAAGGAAA 4560

AATAAAAAAA AAGGAATATT TGTACCCAAC AGCTAGAAGG ATTGCAAGGT AGATTTTTGT 4620

TTTAAAATGG AGAGAAGTGG ACAGATAAGG CCATTTAATA TATCAAAGAT CAGTTGACAT 4680 
A ATGATGAAAA C 
| | | |



MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI

NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMNIK TLQ3AKSSPD TQDLYCLNES 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLV3KHM 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 

DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGFVLF HKIPLDG



Seq ID NO: 19 DNA sequence 

Coding sequence: 82-3600 
21 






I 



I 



CAGTCCCTGC 
ATGGATTTAG 
ATCACAAACT 
CCTCCCAGCG 
GGCCATGCTG 



GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG 
ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC 
CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT 
TCTCCGAGCT TCATCTACCT GTGGACTGAC 
CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA 
ACTACAGTCA CCGAGTAGAG AATGTGGCTT CATCCTCOGG 
CCCAGAATGA TGTGAACCCT GTCTCTCTGC 
AAGAAGTCAT 

CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC 
CCGCCAGGGT CGGCCTCAGA GCTGGCAGGA 



TGTCTGGGAT 



AAGCCGGGCT 
AACATCCTGG 
CTGCCCAAOG 

AGTGGCCAGG GCTGTGAACC 



ATGTCTGTGT 
ACAACAACCG 
ACTGCAATGG 
GGGCATATGG 
rGTCA r 
TCTCCTGCGA 
GGCAGTGTGT 
TCACTGGACT 
GGTCCCGGAG 
TGGTGGGTCC 



TCCAGCAACT CAAAGTCAAA AAATTCAAGA 
CTGGCCCCTG TGCCCCAAAG 
AGGGGAGCTG 
ACCCAAGCCT GGGGCCTCTG CAGGCCCCTC 
CTGCCAGCAC AACACTGCCG GCCCAAATTG 
GCCCTGGAGA 
GCACTCAGAG 



CCTGCCTGGC 120

TGGGGACCTG 180

CAAGCCTGAG 240

CTCCAGGCAG 300

CCCCATGCGC 360 

GGACAGGAGA 420 

TGCCGACTGC 
TGTTCGGTGC 
ACTTAACCTT 
GGTGGGGGAG 
GGGCTACCAC 



CACCGCTGTG 
TGAGCGCTGT 
CCATGAATGC 



GCACTATTTC CGGAACCGGC 
GTGTGATCCG GATGGGGCAG 
GTGCAAGGAG CATGTGCAGG GAGAGCGCTG 
CACCTACGCC AACCCGCAGG GCTGCCACCG 



CAAATGTGAC CAGTGTGCTC CCTACCACTG 
C GTGTGCCTGC GACCCGCACA ACTCCCCTCA 
T GCCCTGTCGG GAAGGCTTTG GTGGCCTGAT 
GCAGCCATCC GCCAGTGTCC AGACCGGACC TATGGAGACG TGGCCACAGG 
3 AACAGAGGGC CCGGGCTGCG ACAAGGCATC 



GGGCTGGAGG 
ATCCGAGCAG 
GCCATCCTCT 



ACTATGTATC 



GACAGCTCGC 



CAGGTGGCTG 



GTGAGCGCCA 
CAGGTCCGGG 
• GAGGCCG1GC 



TTCTCAGCAG CCCCGCAGTC 
CCCTCAGGCG AACTCTCCAG 
CCCTTCCGAG AGACCTGGAG 
AGAGGAAGAG GGAGCAGTTT 
TGCTGAGCAC AGCCTACGAG 
GCCTTTTGGA CCAGCTCAGG 
GAGGAGGAGG AGGCACCGGC 
TGCCTGACCT GACACCCACC 



GCTGCAGGGG 
AGCAGCTGCG GGGCTTCAAT 
AGGAATCTGC CTCACAGATT 



I TGCTTCCAGA 
; AATGCCACCG 
3 ATCCTAGATG 



ACAGAGCAGG 



AGTCTTGACA 
GAAAAAATAA 
CAGTCAGCCC 



AGCCCCAAGC 
TTCAACAAGC 
GAGCTATGTC 
AGGGCCGGTG 
GCCCAGCTCC 
CAATCCAGTG 



CCTATGATGC 
CCAGCCTGTG 
CAAAGAGTAA 
AGGTGGCTCA 
TGGATCTGCC 
GAAGCTTCAA 
GCAGTGCTGA 
AGGCTGCTCA 
GAGAGGCAGA 



CTGTGACTGC 
CTGCCTTTGT 
GAAGCTGGCC 
GCCCACAGTG 

ATGCCGAGCC 
AGGCCGCTGC 
CTACTGCAAT 



CATGCAGTGG 



GCCCAAGAGG 



CGCGTGCCCG 
AGGGCCAGGT 
CTCAGGACAC 
AGGTTCAGCA 
GTGACTTCTG 
CAGTCCAGGC 
GATTTGAGAG 



CCGGTTGCAG 
GGAAGATGTG 
CATGCAAGGC 
GGTACTGCGG 
GACACGGATG 
CCAGCAGCTT 
AATAAAACAA 



ACTGATGCAG 
GACTCAGCTA 
AACGTGGACT 
GCTGAGGCTG 
GTTGGGAACC 
ACCAGCCGCT 
CCAGCAGAAA 
GAGGAGCTCC 
GCGGAAGGTG 
AAGTATGCTG 



AGCGGACCAG 
CCCAGCGCTT 
GCACACGGCT 



TGGTGCTGTC 
AGGAAGCCAG 
TGCGGCAGGG 
CCCTTCGGCT 
AGCTGGTGAC 
GCCACCAAGC 
CCAGCGAGCA 



GATTGAGCAG 
GGTGGCCAGT 
CCTGGAGGAG 
TGGTCTCCTT 
TCCTTCAGGA 
GCAGGTCTCC 
GAGGCTGG7G 
GAGGCTGGAG 
CTCCAGGCAG 
TGGCACAGCC 

GCAGATGA7T 
GGAGACCCAG 
CCTAATCCAG 
GGAGGTCAGC 
GAAGATGAAT 
CCAGACCAAG 
GAGCCGAGCC 
GACAGTGGCA 
TATCCAGGAC 
AAGCATGACC 



2280 
2340 



GGCATTGAGT 3300 
k CCGGTTGGGT 3360 
A TGCTGGGTGA GCAGGGTGCC CGGATCCAGA CTCTCAAGAC AGAGGCAGAG
V3ACCAT GGAGATGATG GACAGGATGA ftAGACATGGA GTTGGAGCTC 
A GCCAGGCCAT CATGCTGCGC TCGGCGGACC TGACAGGACT GGAGAAGOGT 
3TGACCA CATCAATGGG CGCGTGCTCT ACTAT3-CCA3 C73CAAGTGA 
TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCI7T73 37T3-GGG3CA 
GATTGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 
GACCACCCCT GCTGTGTAGC TAGTAAGA7 ? ACXTC,\3C7 ;CAG3TGAG: CT JAGC3AAT 
GGGACAGTTA CACTTGACAG ACAAA3ATGG TGGAGATTC-G 3ATG3CATT3 AA. 37AA3AC 
CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGG3GA 
GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 
AAAATCTTTG G 
I ID N



I I 
MRPFFLLCFA LPGLLHAQQA 
EWQMKCCKCD SRQPHNYYSH 
MEFQGPMPAG MLIERSSDFG 
NARLNGGKVQ LNLMDLVSGI 
VSQLRLQGSC FCHGHADRCA 
PWRPAEGQDA HECQRCDCNG 
HYFRNRRPGA SIQETCISCE 



CSRGACYPPV 



KTWRVYQYLA 
PATQSQKIQE 
PKPGASAGPS 
HSETCHFDPA 
CDPDGAVPGA 



PMRWWQSQND V 



VGEITNLRW 



CACDPHNSPQ 
TEGPGCDKAS 
RLRNATASLW 
TLQGLQLDLP 
AYEQSAQAAQ 
TPTFNKLCGN 
GFNAQLQRTR 
DPDTDAATIQ 



LEEETLSLPR 
QVSDSSRLLD 
SRQMACTPIS 
QMIRAAEESA 
EVSEAVLALW 
SRAHAVEGQV 
SMTKQLGDFW 
RLGQSSMLGE 
EKRVEQIRDH 



PCREGFGGLM 
GPRCDQCQRG 
ASRILDAKSK 



QLRDSRREAE 



SQIQSSAQRL 
LPTDSATVLQ 
EDWGNLRQG 
TRMEELRHQA 
QGARIQSVKT 
INGRVLYYAT 



FTRLAPVPQR 
CQHNTAGPNC 
GVCDNCRDHT 
CKEHVQGERC 
KCDQCAPYHW 
DRTYGDVATG 
CKPCFQTYDA 
PAVTEQEVAQ 
EQFEKISSAD 
GTGSPKLVAL 
VLPRAGGAFL 
ETQVSASRSQ MEEDVRRTRL 
RLPNVDLVLS 
MQGTSRSLRL 
QQLAEGASEQ 



CSAAAIRQCP 
YCNRYPVCVA 
IEQIRAVLSS 
GLLTMYQRKR 
RLVRQAGGGG 



3340 
3930 
39S0 



KPETYCTQYG 
DRRFQLQEVM 
VRCQSLPQRP 
GYHPPSAYYA 



DLCKPGFTGL 
KLASGQGCEP 
CRACDCDFRG 



VASAILSLRR 
PSGAFRMLST 
RLEMSSLPDL 
MAGQVAEQLR 
LIQQVRDFLT 
QTKQDIARAR 
IQDRVAEVQQ 
ALSAQEGFER 



C CCTGACCCTT A 



CCAGAGGTTT 
ATTGACTTGA 
AGCATGGACT 
ACGAACCTGG 
AGTCCCTATA 



TGGACGTATT 
CAGATCAAGG 
AAAAAAGCTG 
GAATTCAACG
CATGCCCAGT 
CCACCCCAGG 



ACTTTGTGGA 
GGCTCCTGAA 



TGAACCATCA 



TGATCACCCC A 



CTGGAACAGC 
GAAGATGGTG 
GACCTGAGTG 
CAGCAGATTC 
AGCGTCACGG 



C TACTGCCAAA 



ACTTCACGGT GTGCCACCCT 
ACCCAGCTCA TTTCTCTTGG 
AGACAAATGA ATTCCTCAGT 



GGGCAAGTCC 



AGGGACAGAT 
ATGTAGAAGA 
TTGGCACTGA 
GGAK3AACCG 
TGGGCCGACG 



AAGCGGTGCC 
AGTCATTTGA
GGAAGACAGA 



CGACAAACAA GATTGAGATT 
ACCCCATGTG GCCACAGTAC 
AGAACGGCTC C 

TCCCCTCC.AA C 

GTCGGCCACC 
TTGCAAAGAC ATGCCCCATC 
GCCTGTCTAC 



TGCCCCTCCT 
TCCCATCACA 
ATTCACGACA 

CCGTCCAATT TTAATCATTG TTACTCTGGA 



TTCGAGTAGA 
GTGTGCTGGT 
TCATGTG 



ACCTTATGAG 



AGGAAGGCGG ATGAAGATAG 
GATGGTACGA 
AAACGAAGAT 
GAAATGCTGT 
ATTGAAACGT 
CTTTCAGCCT 
GACGTCTTCT 
TCTATATTTT 
TGTGTGTGCG 



rTTGAG GCCCGGATCT GTGCTTGCCC 



TTACAAGAAA 
GAACCACTGT 
GAAAGGGGCA 
AATTCACAGG 
AAAAAAGTTG 



TACTGCTGGG 
TTTGTGAGAA 
GCTGTGTACC 
CATGAAACCC 
CTCATTTTGT 
TGTTTACCAT 



CCCCAGATGA 
TGAAGATCAA 
ACAGGCAACA 
GCTTCAGGAA 
TTAGACATTC 
AAGIGTGTGT 
TGTGTATCTA 
CAAAGGCACA 
GGATGTTTTC 
GTTTGTCTGT 
TTAAGATGTT 
GAAGCTTTTG 
TTATTGTCTG 
GCTGGTCATG 
CAGCGAGGTG 
CTTGCATTAT 



CATCAGAAAG 
TCGTCAGAAC 
TGAACTGTTA 
AGAGTCCCTG 
GCAACAGCAG 
TGAGCTTGTG 
CAAGCCCCCA 
GTTGTATTTC 



AACCAGAGAT 
AGGAAGAGAC 
AAAGAACGGT 
ATCCATCAAG 
TACTTACCAG TGAGGGGCCG TGAGACTTAT 
GAACTCATGC AGTACCTTCC TCAGCACACA 
CAGCACCAGC ACTTACTTCA GAAACATCTC 
GAGCCCCGGA GAGAAACTCC AAAACAATCT 
AACCGATCAG TGTACCCATA GAGCCCTATC 
CATGTGTATA TGTGAGTGTG TGTGTGTGTA 



TGCAGATTTT 



AGCAGGTCTC 



TAATAATATT 



TTGTGTCCTC 



TGGAAGACCT 



TATTCAAAGC 
ATTAGAGCTT 
TCAGTGCATT 
AAATCAGCAC 



ACTACAAAAA 
GAAAGACAAA 
TCAAAATAGA 
CTATCCCTCA 
TAGCCAGGAG 
TCCTGGACTG 



ACAGGACTTG 
TGAGAGAATC 
GTATCCTTAG 
TTGTTTCCTG 
CTTTTCTGTC 
AAACTTAAGA 
AGTTGTAGGT 
GCAAGTAGTA 
AAAGTAATCA 
CCCTCATGTG 
GGCATCTGTT 



AAGACACTTT 
TTTTGAAGGG 
ACCGGCCATT 



TTCTTCTGTT 



TCCACCCCAG 
ATTTGAAGCC 
AGCCTACCTA 
ACTTACGTTT 
GAAATTAAAG 



GACTGAGAGA 
AGAAACGAAG 
ACTTTGTGGG 
TAGGTAGAAC 
ATGCTAAA3T 
TGGCCCCCAT 



GTCAGGIGGG 
GTTTTTCTAA 
AGAAAAGGAG 
CTCAGTCAGA 
GTGTCAAGTG 2040



CTCTCACAAA A 



2400 

2520 
2580 
TTCCATTTTA AAAACATATT 2640

C CACACCCAGT 2700

3 TCACCAAGAC AATGATTTCT TGTTATTGAG GCTGTTGCTT 2760
A ACTTTTGCAT CTTGGTTTAA AAGAAA 
MSQSTQTNEF LSPEVFQHIW DFLEQPICSV QPIDLNFVDE PSSDGATNKI EISMDCIRMQ

P QYTNLGLLNS MDQQIQNGSS STSPYNTDHA QNSVTAPSPY AQPSSTFDAL 

T DYPGPHSFDV SFQQSSTAKS ATWTY3TELK KLYCQIAXTC PIQIKVMTPP 

PQGAVIRAMP VYKKAEHVTE VVKRCPNHEL SREFNEGQIA PPSHLIRVEG NSHAQYVEDP

ITGRQSVLVP YEPPQVGTEF TTVLYNFMCN SSCVGGMNRR PILIIVTLET RDGQVLGRRC 

FEARICACPG RDRKADEDSI RKQQVSDSTK NGDGTKRPFR QNTHGIQMTS IKKRRSPDDE 

LLYLPVRGRE TYEMLLKIKE SLELMQYLPQ HTIETYRQQQ QQQHQHLLQK HLLSACFRNE 
LVEPRRETPK QSDVFFRHSK PPNRSVYP 



Seq ID NO: 23 DNA sequence

Nucleic Acid Accession #: NM_001944.1 

Coding sequence: 34-3083 



I 

TTTTCTTAGA 
TTTCACCAGG 
CCATCTTCGT 
ATGATGAAGA 
AATTTGCCAA 
TTACTTCAGA 
ATCAGCCGCC 



CATTAACTGC 
GAAATCAGAG 
GGTGGTCATA



ACCCTGCAGA 
TTACCAAGCA 
TTTTGGAATC 



ATCCTCCAGT 



TGTAGAGAAA 



AAATTGCCTT 
GAAACACTGG 
ATCGTCTGGT 
GTAATATTAA 
CAGCACGTAT 
TGGATGAAGA 
GAAATTGGTT 
AGGCTCTAGA 
CTGAATTTCA 
AGGTAATAAA 
AAAAAGGCAT 
ATGAGGACAC 



GATGATACTA 
CAAAATTGTC 
GGAAGTCCGT 
TGTGAGTGGT 
AGTGAAAGAT 
TGAAGAAAAT 



AGACGGCTGG 
ACAATGATGG 
TTGGTTCATG 
ATGCAACAAG 
GAAGGAGAAG 
ACCCAGAAAA 
TTTGTTGTTG 
ACTCCAAGCT 
CCACTTATAC 
CAAATTTTCA 
AATGCCACAG 
TCTCAGGAAC 



CAGGATAGAA 
GGCTCTTCCC 
GAGAATTGCG 
CTAAAAGAAG 



GCAGCGGCTC ACTTGGACTT 



TCACCTACCG 



TCCTGATCAC 
TAACGGTTAA 
TGGGTGAAAT 



AATAGAGACT 
GCAAAAACGT 
AAGAAACCCA 
AATCTCTGGA 
TGGAGATATT 



TGAAATACAA 



GCAGACAAAG 
GTCAACGATA 
ATTTTAAGTT 
AATTGGCTTG 



ATTCTCTTGA 



ACTTCCCAAT 
CTGAATTACT 
CAGTATATTT 



AATITTGGAT 
TGAAGAAAAT 
ACCAAACCAC 
ACCCATGTTC 
CCGAGAGCAA 
ACTATCAACT 



AAAGGTCAAT 
GAATGQGTGA 
ATTGCCAAGA 
GTGGGAATCG 
AACATAACAG 
CTAAATGCCC 



AGTGCCTCAA 
TTGAATTCTA 
CTCCTAAGCA 
GCTAGCAGCT 
CAATGTGAAT 
TCTCAGTATT 



GGGAATGAAG 



TGTAAGAGAA GGAATTGCAT 
AAATTGGTGG 



ATTCTACTTT 
CGGGIAAAAC 
CAACAGCTGT 
CTAGAACACT 
TAAAGTTGCC 



TTCTACAGGC A 



TCAAATATGT 
\ CAGCTGAGGT 



AACGATGGTG 



ACAATCGGTG 
GCATCTGTGG 
CAGGGAGGCT 
TGGCCCCCCT 
GTGGTTTTAT 
GAGCCCATCC 
GAGCCGATTT 
TGGAAGGCAC 
GTGCTGCAGG 
CTGGAGTTGG 
GAGGAACCAA 
TTTCTCAGAA 
TGTTGATCTA 



TGCCGTATGG 
GATACCTCCT 
TGAGATGCCA 
AACTTCTTAC 



CGCAGCTTGA 
CCAACCACAA 
GCCATCGGCC 
ACCTGTGACT 
GATGGCTCAG 
GAAATCACAA 



TCTGCTGTTG 
CCCAGTTCCT 
TGAAGACAAG 
CATGGAAAGT TCTGAAGTTT GTA( 
TTCAGGAATG 
CTTTGCAACA 
CATCTGTTCC 
TAAGGACTAC 
AGCATTTGCC 



CCTATACATT 
CCCTCAATGC 
ACATCTCCCT 
CACTGGAAGT 
GCCCTGGGAC 
TGCTGCTCCT 



TCTGGCCATA 
CGATTTCAAT 
ACCTTCCGTG 
TGCACTGGAA 
TACCTCGGCC 



GATGAATACA 
GACAATTGTC 
GTTGTCTCCG 
GATCAACCTG



CTGTCAGTGT 
CAGGTATGGC 
TGGTCTCCTG 



GACAGTCAGA 
GACAACAGGG 
AGGCCGCACT 



T TCATCAGTGG GGAATTGAAG 



CCTCTAAAGA 
CTGGGTCTGT 



CAGCGGTTAT 
TAAGTGCCAG 
CCAGCCAGCT 



r AGCTGGCCCA 



TCTTAAAGTT 



TAGCTTCTCT 
TTTCAAAACC 



GCTGATGGGG 
TGTGCGGAGG 
GGCGCAGATG 
GACCTGGATG 
AGCCTTGGTG 
GGGATTGAAT 
ACTTTGTCAG 
GTTTCCATCC 

ATAGTGACAG 
ACGCAGCTAC 
TGACCAGAAT
AAAATAGCAT 
CATAAACTGA 
GTCACTCCTA 
CTAAAATCAT 



CTAAGCTTGG A 
CAGGAGCTGC T 
CTGGAA 



CGATAAGCAT 
AAGACGATGG 
CCACTGGTTC 
ACAGCTTCTT 
TTGATGGTGA 



GAATTTTCTG 
CCAGGAAGCA 
TCCTGTGGGC 
GGACTCACTT 
AGGCAAAGAA 



GACTCCTACT 
AATGACTGCT 
TCCGTGGGTT 
GGACCCAAAT 
GTTCAGCCAC 



AAAGGGTGAT 
GAGGGTCACA 
GAGCTGGAAT 



TTGTCCGCCT 
AACTATTTAG 
GGCTTTGATC 



TGTACAGAGG 



TCACGATTAT 
ATTCTCAAGT 
ATTCGC 



ACTATTCAAA T 



2460
2520

2640
2700
2760
2820
2880
2940



I 



I 



I 
I AKITSDYQAT
h NAQGLDVEKP 
L NSKIAFKIVS 
DKDGEGLSTQ CECNIKVKDV 
WLAVYFFTSG NEGNWFEIQT 
SRYRVQSTPV TIQVINVREG 



GQYDE3EM7M QQAXRRQKRE WVKFAKPCRE 
QKITYRISGV GIDQPPFGIF WDKNTGDIN ITAIVDREET 
LILTVKILDI NDNPPVFSQQ IFMGEIEENS ASNSLVMIIJI 
QEPAGTPMFL LSRNTGEVRT LTNSLDREQA S S YRLWSGA 
NDNFPMFRDS QYSARIEENI LSSELLRFQV TDLDEEYTDN 
DPRTNEGILK WKALDYEQL QSVKLSIAVK NKAEFHQSVI 
TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 



STGGTNKDYA 



T ANGADFMESS E 
3 TVSGAASGFG AATGVGICSS O 
N DCLLIYDNEG A 
S LGVDGEGKEV QPPSKDSGYG IESCGHPIEV QQTGFVKCQT 
LSGSQGASAL SASGSVQPAV SIPDPLQHGN YLVTETYSAS GSLVQPSTAG FDPLLTQNVI 
r QLRGSHTMLC TEDPCSRLI 



GSEGTIHQWG IEGAHPEDKE 
MTTKLGAATE 

□ SYFSQKAFAC 



Seq ID NO: 25 DNA sequence



Coding sequence: 5S-1S42 



Nucleic Acid Accession #: Eos sequence



I I 
AGTATCCCAG GAGGAGCAAG 
GCAAGGGATC CTTTCTCCGC 
CATGTTTGAG TCCACAGCTG 
CTGCTCTGTC GTCTCTACCT 
GGAGAAGGTG AAAGTATACT 
GGAAGATCAG GGTTGTGTCC 
GGACTCTTTT GCCCTGAAGA 
CTTTTCCCAG ATCTTTGGGC 



TGGCACGTCT TCGGACCTAG GCTGCCCCTG 
CAGCGGGCTT 
CAGATTTGGG 



GCTGTCCGAT 
GTCTGTGGTA 
CAAGCAGCAG 
TGAGGGTTAG GCCCTTGTTA 



GTCCCTGGCG CTGATCTTCA 
GCCCTTGCTC TCCAATGAGG 
GAAGAAGCTG TCCCTGCTAA 
GAGGAGTGTC TACATCGAAA 
TGGGCTCTCT TCTATCAGTC 
ATGGGCACAG CCAGACACTG 
GATCTCATTC TTTGAGATCT 



GCAATGAACG 
CAGAAGTGGG 
TCAAAGGGCA
CGATTCAAGG 
ATAGCCTCCA 
TAATCTGGCT 
ATGGAGGCCT 
GTCGGATAGG 
AGTGTACCAG 
CCCCACTACC 



TGTGGAGACC 
GGGAATTGGC 
ACAGGCATCC 
GAACTGGCTC 
TACCATCAAG 
AGGCCAACTT 
AGACAGCAAG 
CCAAGAGGAG 
TACCAGCACC 



GACGATGTCG 
CGCAAGAACC 
GTTCCATGTG 
CCTTCAGAGT 



TAGTTTCTCC 
TGCTATCAGA 
AGGACAGTAT 
TGGAACGACA 
AAGCACCCAA 



TAACTGTGAA 
ATGGAGTCAC 
TTCTCCCCCG 
CTGATCTGAA 



TCGTAAGAAC 



TGGATTCATG 



TGCGGCTATG 
TGCAAGATGC 
CCAGCACCCA 



TGTCCCGGCA 
GCTTTATGAC 
CGAGGATCAA 



CTGGATGAAA 



GTGGCATTGC 



CAAGTTGACT 
CAATGTGAAT 
CATTGCTAGC 
ACTCGTTCAT 
CAGACACAGG 
AGGAGCTCCT 



CGAGTGTTCC 
CCCTGTGCAT 
CAGGTGACTT 
CAAGGAACAT 
CCTTGATGAT 



C TGGCTGGCTC 
A ACATTAACAC 
C AGAACCGGTC 



CCTCAACCAG 
GGGGGAAGGA 
AGAGCGCTGC 
CTCTCTACAC 



CTATTAGAAC C 

AATGGCAATC CCTA 
TGGAAGCT 

AACTCCAGCC GCAGTCACAG 

GATATAGTCC C 



CACAGGCCGA G 



A ACTGGGATIC C 



AACAGCGGGA 
TGTATGAAGA 
TTCAGGAGCG 
AGTCAGTGGC 
CAGCTTCTGC 
CAGAGCTAAA 
CCTCAGCCAA 
TAAGGCTGTT 
CTTGTTGCCA 
TCTTAATCAA 
TGGACCTTCG 
AAGGCCAGGT 
AACAACCACC 
AAAGCTCAAC 
TCAAATCTGG 



GGAGATGCAT
ACAGTGGTGC 
AAAACTAAAT 
GGATGAAAAG 



GATATTGAAA 
GAAGCCATGA 
CTCCGAGATG 
AGTGAACATT 
ATCCTCAAGG 
ATTGAAGAGC 



AGACACTGCT 



AGTCACTGAC 
TAGAAGCTCT 
AATTGGCCCT 
AGGTTAAAGC 
ATAAGTATCA 



GCGGACAGAG 
CAGCACTGGG 
ACAGGACCAG 
GAAGAAGGCA 
TTCTGCCAAA 



CTTCAGAAAC 
GCAGGAAAAC 
ACTCTGGCTG 
GCATGTATTG 
AAGCGCCTTG 



TTGGTGAGTC 
TTCGTCAAGC 
AACTGCAGAA 
CTGAGCAGTA 
GTACCAACCA 



TATCAGGAAT TATATCCAGG 
TATAACCACC TATGTAATCT 
GCACACAAAA ACAGTTATAT 
GTAGCAAAAT CATTAAAACA 



CATGTTGTTG 
AATTATAAAA 



GGATCCTACG 
AAGGCTGTGG 
AGAAATAGGT 
CAGACACTAG 
TTTTTTTTTA 
ATTGTTCACA 
GGGACAGAAA 



CATCTCCATG 
TTTGAAGGAA 
TGAGATGGTA 
AAAGGAACTA 
AAGTTTTTAC 
CTTGCAGGAA 
ACGGCGGTCA 
TAAATTACAG 
GAAAATGTIA 
AGAAGAGGGC 
TCTCCAATCA 
CTTGACCACT 
CAACATGGIG 
TCATACTGTG 
GGAAAATCAG 
TCCCCGAACA 
CTCACGGCGT 
GGAAAGAGAA 
CTCTTTTATG 
CTTTTTTCTC 



CGACAGGAAA 
GAACAGATGC 
TTGGAGGAAA 
CAAGAAGAGA 
GCCAGACAAC 



CAGTGCAAAG 
GAACCACCAC 
CAGAAGAATA 



TGTGATGACA 



CAACCAAACC 
CCAACCTGCC 
TCCCCTITAC 
GAGCAGTCAT 
CTTTACCATA 
ACTTTTGTAT 
TGATTTCTAT 
TTTTTTATTG AATTCCAAAT 3000



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460
2520
2580

2700 
2760 
2820 



■J SPMFESTAAD LGSWRKNLL SDCSWSTSL 3DKQQVPSED 
E RQEDQGCVRI ENVETLVLQA PKDSFALKSN SRGIGQATHR 
FTFSQIFGPE VGQASFFNLT VKEMVKDVLK GQNWLIYTYG VTNSGKTHTI QGTIKDGGIL 
PRSLALIFNS LQGQLHPTPD LKPLLSNEVI WLDSKQIRQE EHKKLSLLNG GLQEEELST3 
LKRSVYIESR IGTSTSFDSG IAGLSSISQC TSSSQLDETS HRWAQPDTAP Ii 
IWISPFEIYN ELLYDLLEPP SQQRKRQTLR LCEDQNGNPY VKDLNWIHVQ DAEEAWKLLK 
VGRKNQSFAS THLNQNSSRS HSIFSIRILH LQGEGDIVPK ISELSLCDLA G3ERCKDQKS 
GERLKEAGNI NTSLHTLGRC IAALRQNQQN RSKQNLVPFR DSKLTRVFQG FFTGRGRSCM 
IVNVNPCAST YDETLHVAKF SAIASQVTCA CPTYATGIPI PALVHQGT 
CTTCCCCTGA 



15 
20 



GCTTTCCAAG 
GCTGACATTT 
AAAGGTGGAA 
TTCGATGAGG 
GTTCACGAGA 



ACAGCTCTAC 
TTTATGGCCT TGAGATAAAC 
TGAAGGAAAA AATCCAAGAA 
ACACATCTAC CCTGGAGATG 
TCAGGGAAAT GCCAGGGGGG 
ATTACACACC TGACATGAAC 
TATGGAGTAA 
TGGTGGTTTT 
TCCTAGCCCA 
ACGAATTCTG 
TTGGCCATTC 



CTTGGGTCT 



GGCATTCAGT C 



TGCCCGTGGA 
TGCTTTTGGA 
GACTACACAT 
CTTAGGTCTT 
TGACATCAAC 



GCTCATGGAG 
CCTGGATCTG 
TCAGGAGGCA 
GGCCATTCTA 
ACATTTCGCC 



TCAGAACCAG 
AAGATCTTTT 
AGTGTTAATT 
GAAATTGAAG 
AATTTAAGAC 
GTGAAAAAAA 
GATAACCAGT 



CTCTCTGTGA C 



TGGTTTTTGT 



TTGATGCAGC 
AGAACTTCCA 
AAACACTGAA 



TTATATAAAA 
CTCTACTATT 
CTCTGTAAGT 
TAAAATTAAG 



ACTTAGAGAT 
TACATAATAT 
AAGTTTGAAA 
TGCTTCCTAA 
TATATATATT 



AGTTTTTCTT 
TTATCCCAAG 
TGTTTTTAAC 
TGATGAAAGG 
AGGAATCGGG 
CCAAGGATCT 
AAGCAATAGC 
CAGCTTAATA 
ATGTATCATA 
TTTTCAATTT 



TGGCTGAAGG 
ACCTTGCCAT 
TTTAAAGATG 
AGCATACATT 



GGAAACATTA 
TTGACTACGC 
GCAAGATTAA 
ACTTCCATGC 
GCATTGGAGG 
CAAACTTGTT 
GTGATCCAAA 
TCTCTGCTGA 
GCTTGCCAAA 
CTGTCACTAC 



AGTCCCCGAT 
TATCACCTAC 
AATCCGGAAA 



TTTTGATGGC 
CCTCACTGCT 



TGACATACGT 780 



CTGGCATTGA 
ACAAATACTG 
CTTTTGGTTT 



CGTGGGAAAT 
ACCAAAGACC 
AGCTGCTTAT 
GTTAATTAGC 
TCCTAACTTT 



AGACAGATGA 
CCTAAAATTG 
AACCAATTTG 
TGGTTTGGTT 
AGTATTTATT 



TCAAAGCAAG 
TAAAATTG 



GTTGAAAATG 
GCATATTTGC 
CTGTAAACCA 
AATTGTCCAT 
ATAATTCTAT 
ATACTTACTT 



TTATCCCAAA 
CTACTCTAAA 
CCTACTCCAA 
GTGTAATTAA 



TAGGTAATGA 
TCTTGCTTGA 
TTGAAGCATG 
CTGGCATAAC 



1020 

1140 
1200 
1260 
1320 
1380 

1500 
1560 

1680 



45 
50 
55 
60 



1 11 21 31 41 51 

I I I I I I 

MKFLLILLLQ ATASGALELN SSTSLEKNNV LFGERYLEKF YGLEINKLPV T 

KEKIQEMQHF LGLKVTGQLD TSTLEMMHAP RCGVPDVHHF REMPGGPWJR KHYITYRINN 

YTPDMNREDV DYAIRKAFQV WSNVTPLKFS KINTGMADIL VVFARGAHGD FHAFDGKGGI 

LAHAFGPGSG IGGDAHFDED EFWTTHSGGT NLFLTAVHEI GHSLGLGHSS DPKAVMFPTY 

KYVDINTFRL SADDIRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 

FKDRFFWLKV SERPKTSVNL ISSLWPTLPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 

EPNYPKSIHS FGFPNFVKKI DAAVFNPRFY RTYFFVDNQY WRYDERRQMM DPGYPKLITK 

NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC 

Seq ID NO: 29 DNA sequence 

Nucleic Acid Accession #: NM_00S115.1 

Coding sequence: 236..1765 



I 

GCTTCAGGGT 
CGGGACACCC 
ACTCTCTGAG 
GAGACCTAGA 
ACGAAGGCGT 
CCCACGGAGA 
TGCCGCCCTG 
CGGGAGACAC 



C CGCAGCCAGA 



TGGACTTGAT 
GGATTTACGG 
TCTGTACTCA 
TGGTTTGAGC 
CCTCAAGGAA 



AATCCAAGCG 
TTGTGGGGTT 
CTTGTGGAGC 
GAGTTGCTGC 
AGCCAGACCC 
CTGATGAAGG 
GTGCTCCTTG 
AAGAACTCTC 
TTTCCAGAGC 
ACAGAGGCAG 
GGTGCCTGTG 
CTACGCCTGT 



TTTGATTATT 
TTGGAGGTCC 
CCATTCAGAG 



AGCCGGGCCT 
CCTGTCAACA 
ACTCTCAGAC 



CCAGGGAGCT 
TGAAGGCAAT 
GACAACATCT 

ATCAGGACTT 



TACCTGGAAG CTACCCACCT T 
GCGTAGACTC C 
GCAGTATATC G 
TCTGGACTCT TTATTTTTCC T 
CCCCTTGGAA ACCCTCTCAA T 



GGTGCAGGCC 
TCACCTGGAG 
TCGCCCCAGG 
CTGGACTGTA 
TCAGCCCATG 
CATTCCAGTA 
CTCCTACCTC 
GCTGAAGATT 
GGACTCTATT 
TTCTCCTTAC 
ATCTTCCTAC 
CCTCAGTCTG 
CCTGGATCAG 
CCGGCTTTCG 
TGTCCIGAGT 



51 

I 

AGCACCGCTC 
GCAACTTCGC GGTGTGGTGA 
GTGCGTGGCA ACAAGTGACT 
CTAAGTCGCT TCAAAATGGA 
AGCATGAGTG TGTGGACAAG 
AAGGATGAGG CCCTGGCCAT 
CTCTTCATGG C 



ACCTTCAAAG CTGTGCTTGA 
AGGTGGARAC TTCAAGTGCT 
TGGTCTGGAA ACAGGGCCAG 
ACAAAGAAGC GAAAAGTAGA 
GAGGTGCTCG TAGACCTGTT 
ATTGAGAAAG TGAAGCGAAA 
TTTGCAATGC CCATGCAGGA 



CTGGGCCAGA TGATTAATCT 
ATTTCCCCGG AGAAGGAAGA 
CAGTGCCTGC AGGCTCTCTA 
TTGCTCAGGC ACGTGATGAA 
GAAGGGGATG T'GATGCATCT 
CTAAGTGGGG TCATGCTGAC 



CGATGTAAGT CCCGAGCCCC 
CCTGGTCTTT GAK3AGTGTG 
GAGCCACTGC TCCCAGCTTA 



TCCAAGCTCT 
GGATCACGGA 
CAACCTTAAG 




CTTGCAGAGT CTCCTGCAGC 
TGTCCCCCTG GAGAGTTATG 
TCTGCATGCC AGGCTCAGGG 
TAGTGCCAAC CCCTGTCCTC 
GTGCCCCTGT TTCATGCCTA 
TTGGACACTA AAGCCAGGAT 
ACAAATGTTC AGTGTGAGTG 
GTTCAGTGAG GAAAAAAAGG 
GTGATCTTTG GGGAGATACA 
GATTCTGGCT TGGGAAGTAC 
T AAAGAGAAGC 



1360 
i : E 0 






AGGAAAACAT GTTCAGTGAG GAAAAAACAT TCAGACAAAT 
GGAAGTTGGG GATAGGCAGA TGTTGACTTG AGGAGTTAAT 
TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAGGGA 
ATGTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 
AAAAAAAAAA AAAAAAAA 



20 
25 



GCTTCAGGGT 

ACTCTCTGAG 
GAGACCTAGA 
ACGAAGGCGT 



I I 

" ACAGCTCCCC CGCAGCCAGA 
CACCCGCTTC CCAGGCGTGA 
GAAAAACCAT TTTGATTATT 
AATCCAAGCG TTGGAGGTCC 
TTGTGGGGTT CCATTGAGAG 
CTTGTGGAGC TGGCAGGGCA 



GCAGCGCCTC 
CCTGTCAACA GCAACTTCGC 



TGAGGCCAGC 
CCGATACATC 
GAGCCTGCTG 



GTGCGTGGCA 



2 AGCCAGACCC TGAAGGCAAT G< 

3 CTGATGAAGG GACAACATCT T 



AGCATGAGTG 
AAGGATGAGG 
CTCTTCATGG 



GGTGTGGTGA 
ACAAGTGACT 
TCAAAATGGA 



G ACCTTCAAAG C 



CCCTGGCCAT 
CAGCCTTTGA 
CCTGCCTCCC 



GGATTTACGG 
TCTGTACTCA 
TGGTTTGAGC 
CCTCAAGGAA 
GAAAAATGTA 



GTGCTCCTTG 
AAGAACTCTC 
TTTCCAGAGC 
ACAGAGGCAG 
GGTGCCTGTG 
CTACGCCTGT 
ATCCTGAAAA 



ATCAGGACTT 
CAGAAGCAGC 
AGCAGCCCTT 
ATGAATTGTT 
GCTGTAAGAA 



GCAGTATATC G 



TGGCGAAATT 
ACATCCATGC 
CCTCTCAGTT 
TTAGAGGCCG 
TAACTAACTG 
GTCAGCTAAG 



CTGGACTGTA 
TCAGCCCATG 
CATTCCAGTA 
CTCCTACCTC 
GCTGAAGATT 
GGACTCTATT 



AGGTGGAAAC 
TGGTCTGGAA 
ACAAAGAAGC 



ATTGAGAAAG 
TTTGCAATGC 
GAAGATTTGG 



ATCTTCCTAC 
CCTCAGTCTG 
CCTGGATCAG 
CCGGCTTTCG 



CAGTGCCTGC 



CCTGGTCTTT 
GAGCCACTGC 
CTTGCAGAGT 
TGTCCCCCTG 
TCTGCATGCC 



TCCCAGCTTA 



GGATCACGGA 



GTGCCCCTGT 
TTGGACACTA 
ACAAATGTTC 



GTGATCTTTG 
GATTCTGGCT 
TGTTGAAAAT 



GAGAGTTATG 
AGGCTCAGGG 
CCCTGTCCTC 
TTCATGCCTA 
AAGCCAGGAT 
AGTGTGAGTG 
GAAAAAAAGG 
GGGAGATACA 
TGGGAAGTAC 



ACCTCATCGG 
AGGACATCCA 
AGTTGCTGTG 



GCTGGAGAGA 
TGATCAGCTC 
CTTCTACGGG 
GC7GAGCAAT 



TTCAAGTGCT 
ACAGGGCCAG 
GAAAAGTAGA 
TAGACCTGTT 
TGAAGCGAAA 
CCATGCAGGA 
AAGTGACTTG 
TGATTAATCT 
AGAAGGAAGA 
AGGCTCTCTA 
ACGTGATGAA 
TGATGCATCT 
TCATGCTGAC 
CCCTCCAGGA 



CCATATCTGC 



AGCCCATCCT 



ACTAGCTGGG 



AGGAAAACAT 
GGAAGTTGGG 
TCTTATAGAG 



GTTCAGTGAG G 

TGTTGACTTG A 
AATCTGAATT TCTAAAGGGA 
TAATCCCTGT GTAGACTGTT GTAAAGAAAC 



AAAGAGAAGC AATGTGAAGC AAAAAAAAAA AAAAAAAA 



C GCTCTCGGCA C 



C GCCCGCGTTC 



I 



CCTTCTAAAC 
TCTGCAGACC 
T A CACAGCC A 
GACAAAAGGA 
TCGAAGACAA 
ATTCCTTGCT 



A CAAAATAATT 



GGGCTGTTGC 
AACAGACACA 
GACACACTAG 
CTATGCAAGA 
CAGCACAGAA 



GAAGCCTGCA 
GGCAGAGTTA 
GATTTCAGAG 
AAGAAAAGAT 
ACTGTGCTGC 



TCCTGGCCCT GCCCGGCATC 
CCGTCTGCCT GCATCTGCTG 
AAAAGGTGAT ACTTAATGTA 
ATTTGGAAGA GTGCTTCAGG 
TTCTAAATGA TGGGTCAGTG 
CATTTACCAT ATGGCTTTCT 
TAGAACATCA GAAGAAGGTA 
CCAAGAGGAG ATGGGCACCT 
CATTGTTTCT TCAACAAGTT 
TAAGTGGACG TGGAGTTGAT 
GAAATCTATT T 




TATGAAGCAT TTGTAGAGGA 
GATAAGGATT 
GAAAATGGAC 
GTAAAGCCAC 
GAAGCGCCAT TTGCTAGAGA 
GTTCATGTGA GGGATCTGGA 
AT T AAAG AAA ACTTAGCAGT 



TGCCACAGAC A 
GCAGACACCA A 
AGTCTCTCAT T 
AGACATGGAT G 
AGATTCAAAT G 
AAATGCATTC A 
TGCCAATTGG A 
CAGCACAGAC A 
AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 



2 AAAGAAACTA A 
AATAGAAATG GCAATGGTTT 
AT TGATG AAA TTTCAGGGTC 
CCCAAAAATG AGTTGTATAA 
ACTGGAACAC TTGCTGTGAA 
GAATATGTAG TCATTTGCAA 
GATGAACCTG TCCATGGAGC 
AGTAGACTGT GGAGCCTCAC 
T TTCAAGAATA 



AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 
AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 
TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 
CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 
GGGTATACCG ACATTTTAGC TGTTGATCCT 



C ATATCAGAAA 
A AAGACAGGGC CGGCCAAGCT 
A GTGTCGTGCG 






CCTGGAGACG ATAGAGTGTG 
AGCCAAGGTT TTTGTGGTAC 
GAAATGATGA AAGGAGGAAA 
ACCCTGGACT CCTGCAGGGG 
GAGTGGCACA GTTTTACTCA 



TATGGGATCA 
AGGACACACG 



GGATTTATGA 
GGAATGAAAA 
GAATCCTGCC 



CCCAAACTAC 



GGAAACCATT 



CCAGCTGGTT 
AATAATTTGG 
AGTGCTACAA 
TTCAATTTCA 
CCAATTTATA 
AACAGACAAC 



ATAGCTAAGT 
ATAAACAAGA 
ACTGAATTAA 
ACCAAATTCA 



TCTATAGGAA T. 



TGGTAAATCT 
TTTTACGGAT 
TATGCTAATA 
AATATTGAGT 
ATTAAAAATG 
TTTGACTTTG 



ATATGATGAT 
CAGTTGTTGC 
CAAACTCCAG 
ATTTTAGTJ 
TCACATTATT 
ATCACTATGT 
TTGCAGCTCA 
GAGGCAAAAT 



C GGTGAAAAAT 
C CTCACTTATA 
A AAGCAGGAAG 
A GCAGAAGCAT 
T CTGGAGGTTT 



GTAAATAAAT 
TAGCTTTGCT 
GAATACTCGC 



AAATGAGAAC 
TACAATAGAA 
AGTCTATGAG 



ATGTATTCAC 
GAAGAAAGTT 
TAAAGAATTG 
GTGTTGAAGT 
AATAAATGTG TGTGTGTATA 



TTGCAGTCTG 



CTCTATTGCT 
TAACCATGTC 
GCACCCTGGG 
GTCTGGGAGC 



GGAGCTAATA 
GTTTCTATTC 
CTCCTAGAGT 



AAAGAGGAAA 
AAAAGAGAGA 
GAAATAGTTC 
TGGTTTCTGT 
TTTCAAGATT 
GTTCCCTGCT 
ACAAAAACAT 
TCTCTTATAG 



ATGGTAAAAA 
GCTTCCTAGG 
CTGTCCAATT 
GGGAAGGAAA 
TCTGCATCCA 
TTTTGGTAGC 



CACTTACTCG 
TGCA7CGATG TAATCAGAAT 
ACTATGAGGG AAGAGGATCT 



ATAATGTCAC 
TATTGTAAAG 
ATGCTACTCA 
AAAATGTTAA 
AAGCATCTGC 
TAGTCCAACA 
AGTTTAAAAA 
AACAATGAAG 
CTACTGCACT 
GTAGCAATTT 
TCAATGCAAT 



GCACAAAGAG 
CCAAAAATAA 
ATTTTGAATT 
CAAAAAGTGA 
AAGGTCTCTA 



GCCCTATGAA 
TTATTATTAA 
CTTGAAATGA 
CCTGGGCTCT 
TGTGTAATTT 
TAGGGAATCC 



TAAATGCTGC 



AAGGGTCCAG 



GTCCTTAAAC 



TGACCAACAT 
GAGGGAGCTG 
CTAAGCCCCA 



ACCTCCAGCA 



TTGAATGTAT 



GGGCAAGGAG 



AGTAGGT TAT 



h GGCTTGGCAC 



CCTTGTGGGC 
TAAGTGACTC 
TCTCCAGAGA 
GATCAAGTTG 
TTGTACAGTC 
GAATATGGGT 



ACCTCTTCTC 



AGAGGG CAAC 
GGGAGTAAAA 
TTTCTCAGGC 
TATGGCTCAC 



3120 

3240 

3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 

CGGCTTTCTG CTAAAGCAAC 4080 
ACCATCCTTC AGCGTGAATT 4140 
TAATAGAAGA AATAGAAATT 4200 
ACAGAGGGAA CTTTGGGAGA 4260 
AGGAAGATGC AGGCCTTCAA 4320 
GCAACATCGT CTGCTTCATA 4380 
CAATGGCAAC 



AGGGGAGGAT 



AATGGAACAG 
GCAAACTGGG 
AGATGAGGTG 
AAGTTAAATC 
TAGATCCAAA 



3 CATTCATGGG 



GCCTATCTAA 



TTCCTTCTGC A' 



GCCTCCTGAG 
TTTAATAGAG 
ATCCGCCTGC 
CTTGTTTTCC 
TGATCATACG 
GGGAGAAAGA 
TTGCTGAAAT 
ACTGTGTTTT 



TAACCATCTC 
TTTGTAATTC 
CATATGTAGT 
CAAGAAAATA 
TGTAAATATA 
GGGGTTTGTT 
TGCTTTTAAA 
TACAGATGTG 
TAGAGATTAA 
AATAGAAATA 



CTGCTCACTG 
TAGCTGGGAC 
ACGGGGTTTC 
CTCGGCCTCC 
GTTTAAAGTC 
AATTGGATCA 
ACTCAGGGCA 
TTCCTGCTGT 
GCTCACTCCC 
TAAACTTTCT 
TTTGTTCTTT 



3 AGACGGAGTC 



TACAGGCGCC 



ATTATTATTT 
TATTTTTAAA 
CAGAATGTTT 
TTGCAATGTT 
GAAACTTGGC 



CAAAGTGCTG 
GTCTTCTTTT 
ATCTTGAAAT 
CAAAATATTG 
AACCAGAAGC 
TCACTCACCG 
CAAAGAGCAA 
GAACATGCTG 
AATGAAAATT 
CCTTATATGT 
GCTTTCATTT 



TCGCTCTGAC 
CTCCCGGGTT 
CACCACCACG 
CCAGGATGGT 
GGATTACAGG 
AATGTAATCA 
ACTCAACCAA 
GTCTGAGAAT 
CAGTTTTATC 
ATCAAAACCT 
CCAGTATCAC 
AAAACCACCT 
TAATTTTAGG 
GTAAGGTGAA 
TTCCCCCAGT 
TTATAAGGAA 
TTTTAGTATT 
AAGCAAAAAT 



AGACTTAGAC 



TATTCCTACA 



CATGCCATTC T 



CTCGATCTCC 
CATGACCCAC 
TTTTGAACAT 
AAGACAGTCG 
GGAATTCTCT 
TAACGGCTAC 
GCTACCTCCC 
TTCCCTGTTT 
GGTCTGCATG 



GAATGATTTA 
GCAGCTGTCT 
GCTATTAAAA 
TGGATGCATA 



GTAAGCCTAG 
TGAAACACCC 
CAAGACTTTA 
ATAAAACCTC 
TATGCCCGAA 

TTTGAGTGTG 
GAATTTTTTA 
AAAATGCAGT 



CTCAATTATG TGT 



ATTAAAAGTA TTAGAAGGTG 
GGCAATATTG CAGTCTTGAT 
AGTGTGCTCC 
CATTATTTTT GTGTATGTCT 



AGTATCTATG 
GTTATAATTG CAGAGTATTC 
CAGTGTAGTC 



ATGCTTATGG 
AATATTTTGG 
TGGGAAGAGA 



ATGAGTTAAA 



TGAGAAGCAT GGACACTAGA 
TTCTGTGTGA CCTTTGAAAG 
GAACAATGCC AGCCTCATGG 



5400 
5460 

5580 



ACAATGTTTC 
TCACTATTTT 
TTGATCGGGT 
ACACTGACAC 
AAGAAAAGCA 



GGTGCTCTGT GCTTCACAGT GAATCTTTTC C 
TTAAGACTGA TCATTTCAAA AATCTATTAG C 
GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 
T CAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 
CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 
GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 
GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 
GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 
ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840 

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900 
TATATATAAT CCCGAAACAT G 



1 11 21 31 41 51 

I I I I I I 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 
EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHPKISTDK ETHEGVLSVV 

KPLNYEENRQ VNLEIGVHNE APFARDIPRV TALNRALVTV HVRDLDSGPE CTPAAQYVRI 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCNQNE 

DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR 

Seq ID NO: 33 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 64-2583 

1 11 21 31 41 51 

I I I I I I 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 
CCGATGGCCG CCGCTGGGCr r T( GCGGAG CCGTCTGCCT GCATCTGCTG 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 
CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 
TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 
TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 
GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 
TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 
ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 
GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 
AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 
CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 
GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 
T TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG T 

TGGTTTG TGCCACAGAC AGAGATC 
A GCATTTTGCA GCAGACACCA AGGTCACCTG G 
AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT A 
TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTIT 1 
ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 
TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 
GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 
GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 
GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 
GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 
GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 
ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 
AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 
ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 
CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 
ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 
GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 
GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 
AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 
AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 
GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 
ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 
ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 
GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 
CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 
AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 
GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 
ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 
GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 
TAAAAATTAA ACATAAAAGA AATTGCATCG ATGTAATCAG AATGAAGACC GCATGCCATC 2640 
CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 
CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 
ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 
TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880 
TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 
AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000 
TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTACG 3060 
GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 
ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 
AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 
ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCCCTACTGC ACTACCAAAT TCATTTGACT 3300 
TTGGAGGCAA AATGTGTTGA AGTGCCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360 

GGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AGAGCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAGGAAATAG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600 

TTCTGGTTTC TGTGGGAAGG AAATAGGGAA TCCAATGGAA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGAATACT CGCTGCAGCT 3720 

GGGGTTCCCT GCTTTTTGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCCTAG 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960 

ATTGTCCTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAGGTTT TCCACCATCC TTCAGCGTGA ATTAATTTTT AATCAGTTTG 4200 

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAGGCCAC 4380 



CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500 

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740 
TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT T 
TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA T 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040 

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 
CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT G~ 
AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA A 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 
ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA A 
CAGGCAATAT TGCAGTCTTG AT1CTGCCAC TTACAGGATA G 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 
GGAGTGTGCT CCCCTACAAA CGTTAAGACT G 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC T7C 
ACCATTATTT TTGTGTA 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG G 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020 
CATATATATA ATCCGGAAAC ATG 

Seq ID NO: 34 Protein sequence 
Protein Accession #: 



MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 
ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 
KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 



EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EESIRGHTG 

Seq ID NO: 35 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 146-1273 






C GTGGCGGTGC TGCCCAGGTG AGCCACCGCT GCTTCTGCCC AGACACGGTC 
GCCTCCACAT CCAGGTCTTT GTGCTCCTCG CTTGCCTGTT CC7TTTCCAC GCATTTTCCA 
GGATAACTGT GACTCCAGGC CCGCAATGGA TGCCCTGCAA CTAGCAAATT CGGCTTTTGC 
CGTTGATCTG TTCAAACAAC TATGTGAAAA GGAGCCACTG GGCAATGTCC TCTTCTCTCC 
AATCTGTCTC TCCACCTCTC TGTCACTTGC TCAAGTGC-GT GCTAAAGGTG ACACTGCAAA 



CAACAACTCA 
GAAGAAATTT 



CAGACCAAAA 



AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 
ATCTTTCTAC AGAGTTCATC AGCTCTACGA A 
TTGACTTCAA AGATAAP 
TCACAGATGG CCACTTTGAG A 
TCCTTGTGGT TAATGCTGCC TACTTTGTTC- G 



AGAGTCACTG 
CATTCCAAAA 
GCTGAAACAT 



CAGATGATGA 
CAATTGTAAG ATCATAGAGC 
GTGGAGGATG 
TCACAGTGGA 
TTTAAGGTGG 
ATCTTCAGTG 
TCAAATGTTA 
TTCCATAGAG GTGCCAGGAG 



CTGTTCTCCT 
ATTTCTGTAA 
CTGGATCAGG 
TTTTTCCAAT 
TGAAAAGGAA 
AGTTGGCAGA 
TCTTCCCAGC 
CCTGAAAGAC 
AACTTGGGCA 
GCAGGTGTTT 
CTGTATGTTA 



ACATGGAGGC 
TTCCTTTTCA 
AGTCCACAGG 
CTAATCCCAG 
AAAAGATGAT 



CACGGATCCT 
GGCACAACAA 
AGCCCATGTT 



AAGCCGCCAG 
TCTATCTTTT 
TCACGTTAGA 
AATACAGTCT 
ACTATGCTTT 
TGAAGAAAGT 



TACTTGTCAT 
GTTTCCTTTT 
GGAAAAATAT 
TCCACAAAGA 
CCTTCTTTGG 
GTAGTGCATG 



CACGTTCTGT 
AAATAAGCAT 
CTTGGAGAAG 
CACCATGGCC 
TGATCCCAAG 
TGATTTCTCT 
GTGCTTAGAA 
GCAGCACAAG 
AACTCGAAAC 
AAGTCCTCCC 
TTTTCTAGAT 
ATGTAGCCTT 
TTCCCATAAG 
TTATTCATTA 
AAATTCCTAT 
GATAGAGAAT 



ATGGGAAACA - 



ATTGAAAAAC 
GCTTGTCTGG 



rCATCCTACT 840 



ATAACTGAAG 
GATGAATTGA 
ATCATTTTCT 



CCATAAGGGG 



TGAATTTTGG 
TGGATCAGAT 
AACAAAATGT AGAATATTCA 



TCCAGAAGTC 



CTGGGGCAGC 
GACAAAATGG 
CGTATGCCAC 
GGATAAGGAA 
CTTAATATAA 



TT7GTCAAAT 
AAGGAAGATT 
GTTCCAGACA 
AACTGCCCTG 
CTTATGTTAA 
AAGATAATAT 
AACCTATAAA 
ATACATAAAG 



AGACCAAGGG 
ATGGTGGGGA 
ATGCTGACCA 
TTGGCAAATT 
GTGGATGCCG 
GCTAATGTTG 
GACCTTTTTT 
ACGCTTTTAA 



GCACAGGGAT TCTCACAATA 
TCTAATATGA TAGCGGGAAA 
GATTAAAGTG CTCACGTTAC 
AGATGGCAAG CATGTAACTT 
CCTGTTGCCG GTTCATGGAT 
TGACATTCCT TCTCCCATCT 
AGATTCAATA TTGAATTTCT 

Seq I 



GCCGATATCA 
AGGAGAGGAA 
CTTGACACAT 
ATATTAATAG 
TACTTCTCTA 
CTTCCTTGAC 
CCTATGCTAT 



GAATTTGTGT TGAAGGAACT 
ACTACTGCCT TTAGAAAATA 
AGTTTTTCAG TCTATGGGTT 
TAATTTGTAA AGTTGGGTGG 
TAAAAAATAT ATATTTACCA 
ATGCATTGTA AATAGGTTCT 
TGACAATAAA ATATTATTGA 



TGGAAGCTCT 
TTCTCGCTTC 
GCTCCAGTGA 
GCCCTGGCAG 
TTTACATACA 
TCAACACCTT 
ACTAAGTAGC 
AAACACTTCG 
CTAGTAGCTG 

TAGCTGACTC 
TGTCTCTTCA 
TAAGTAAAGT 
TAGTTACTTT 



AAAAATTTTG 
TCTTGTTCTG 
ACTACC 



MDALQLANSA FAVDLFKQLC EKEPLGNVLF SPICLSTSLS LAQVGAKGDT ANEIGQVLHF 
ENVKDIPFGF QTVTSDVNKL SSFYSLKLIK RLYVDKSLNL STEFISSTKR PYAKELETVD 
FKDKLEETKG QINNSIKDLT DGHFENILAD NSVNDQTKIL VVNAAYFVGK WMKKFPESET 
KECPFRLNKT DTKPVQMMNM EATFCMGNID SINCKIIELP FQNKHLSMFI LLPKDVEDES 
TGLEKIEKQL NSESLSQWTN PSTMANAKVK LSIPKFKVEK MIDPKACLEK LGLKHIFSED 
TSDFSGMSET KGVALSNVIH KVCLEITEDG GDSIEVPGAR ILQHKDELNA DHPFIYIIRH 
NKTRNIIFFG KFCSP 



Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: NM_0168583 

Coding sequence: 72-842 



GGAGTGGGGG AGAGAGAGGA 
TAAGAGCAAA 
CCATGGCCCA 
ATCCAGCCCT 
ATGGCCTGCT 
TGAAGCCTGG AGGAGGTACT 
TGGCCTGAAC 
TGTGCAGAGC 
AGTGAATACG 
TGCAGAAATC 
CACCCATTCC 
TCAAGGTCTT 



I 



I 



GACCAGGACA GC1GCTGAGA CCTCTAAGAA GTCCAGATAC 
ACTGGGGGCC TCATTGTCTT CTACGGGCTG 7TAGCCCAGA 
CTGCCCGTGC CCCTGGACCA GACCCTGCCC T 
CCCACAGGTC TTGCAGGAAG CTTGACT 
CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 
TCTGGTGGCC TCCT7GGGGG ACTGCTTGGA AAAGTGACGT 



CCCTCCCCAT 
AGTTGGTTCA 
CCCTGGTGCA 
AAGCCTTCCA 
GCCCATOTGC 
TCCCACCAGG 
AAAAAAAAAA 



TGACATTGTT 
GGAAGGGGCT 
TGGAAGATGA 
CGTGTGTAAC 
AAAAAAAAAA 



CCTGATGGCC ACCGTCTC7A TGTCACCATC CCTCTCGGCA 
CCCCTGGTCG GTGCAAGTCT C-T7GAGGCTG GCTGTGAAGC 
TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 
CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 
CTGGACAGCC TCACAGGGAT CTTGAA7AAA GTCCTGCCTG 
TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 
AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 
GGCCTCTGCT GAGC7GCT7C CCAGTGCTCA CAGATGGCTG 
TCTCTCCGA GGAACCTGCC CCCTCTCCTT 
I I I I I I 

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 

SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 

VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 

THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 
DIVNMLIHGL QFVIKV 



I I I I I I 

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60 

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120 

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180 

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240 

TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300 

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360 

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 420 

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480 

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540 

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600 

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900 

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020 

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080 

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440 

CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 
CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGA"/. I 1 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800 

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860 

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920 

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980 

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160 

CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGGT TGCTCTGATA 2220 

TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCGCTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520 

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCAGGAAACT TTTTTTCTTT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820 

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 

1 11 21 31 41 51 

I I I I I I 

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLVHNLPQ 60 

HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 120 

TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180 

NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LNVLYGPDAP 240 

TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 300 

AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360 

QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420 

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 480 

NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540 

LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 600 

PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNNN GTYACFVSNL 660 
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI 
Coding sequence 



ATGGCGAAAG ACAACTCAAC TGTTCGTIGC TTCCAGGGCC TGCTGATTTT 



GGCTGCCTGG 
AGTATATGCC 



CACAGCCTCT 



ATGAAGTCCA 
TTTGAAGTGG 
TTCCTGAAGC 



TTGTGGGCAT 



CAATTGCTGT 
TGAGAATAAT 
AGAACCTCTC 
CTGCTATGAA 
ATTTGCCATT 
AAT TGAAT AT 



GATGCTGACT 
AACCTGGAGG 
CTGATCTCTG 
CTCTGCTGGA 
TAAGAA 



CATCTTGTAT 
AGATGCTAGA 
ATGGAGTCAC 
GTCCATCAGA 
ATCCCTGGCC 

GTCCAATGAA 



TGAAGCCACC GACAACGATG A 
CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 
TCTTCTGGCG TATTTCATTC TGATGTTTAT 
CACAGCAGCA ACACAACGAG ACTTTTTCAC 
GAGGTACCAA AACAACAGCC CTCCAAACAA 
CAAAACCTGG GACAGGCTCA TGCTCCAGGA 
CTGGCAAAAA TACACATCTG CCTTCCGGAC 
TCGTCAATGC TGTGTTATGA ACAATCTTAA 
AGGCGTGCCT GGTTTTTATC ACAATCAGGG 
CCGACACGCC TGGGGGGTTG CCTGGTTTGG 
TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 



I 



I 



I 



I 



I 



MAKDNSTVRC PQGLLIPGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 
IGIPVGICLP CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 
FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGP3DWQX YTSAPRTENN 
DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 
LCWTFWVLLG T 



I 

GCCGGACAGA 
ACCTCTGTCC 
AAGATTTCAA 
AAGAGAACAC 
ACAGTTTTTG 
ACCATATAAA 
TGAGAAG AT T 
TGAAATAGAA 
AGTAACTAAC 
GGCTTGCATG 



TCTGCGCGTA 
CAAGCAAGAG 
AGCTGGAAAA 



CTTTCAACAA 



TATTCATGCA 



TCTTTGCACC 
TCCAGATGGA 
ATTTACTGCT 
CCAGGAATTG 



CAAGCATTTG 
AGAAAGGGAA 
TTGATACCAG 
GGTTTGGCAA 
GCCCAGGAAG 
AGGGTGTACA 
AAATACATTG 
AAGATGGCTT 
AAATACAGTC 
CTCCGCAGCT 



I 

' TCCTGGAGCC 
AGATGAATGG 
GGGGAAGAGG 
TGAGTAAAAC 
AGACCCCACA 
TTTATTTCTC 



AGAGTATAGA 
TGGTGGGAAC 
CACAGGAAAA 
GTCAATGCAG 



TGAACIAGGA 
GGCAGAGGAT 
TTCTCAGGAA 



GTATTTTGGT 



TACATCAGGT 
GATTGTCTAA 
ACTATGAGCC 
CTCTAAGAGG 
TTCTTTGTGC 



CACAAGGCAT 
AGATTTTAAA 
TGAACTAAGA 



AATTGTCAAA 
CCTTTTGTAT 
TGAGGATGGG 



CATGATCTTG 
GTCTCAAATG 
ATTGAAGCAA 
TGTAAGCATG 
CAAGCTGAAG 
CATGAACTTG 
GATGACAAAA 
GGCCTAGGAA 



TGCATGTGGA 
GTGTCCTGTG 
AGTTACGATG 
AGCAGGTCGG 



AGCGATAGCT 
ATTGATT7GT 
GAACTGACAG 
GATGCACCTG 
GACCTTGAAA 
ACAATGGTAA 
CTCAAGAATG 
CGTGTCAGTA 



51 

I 

GAGCTTTGGG 
TTGGACGAGG 
AATGGAGAGA 
AACAAACCCC 
ATCGATTCAT 
CTCCTTTGAT 



ATGTGCCACA 
TCAGAGCAAA 
ATATAAAGCC 



CATGGAGCAG 
AACTTCCATT 
T1CTGAGAAT 



TCTGGAGATT 
ATCGATGAAT 
CAAAGTATTA 
ATTGCTGCTG 
TTAAAAATGG 



ATTCTATTAG 
GAATGTTGAT 
AAAACCTGTT 
TTAAAGCAGG 
ACAGAATTCC 
AAAGTCAAAT 
ACACCACGAC 
TTGCTTTGGA 
TTGATAAGAT 



TTCTCGAAAT 
TAATAGCAAA 
GGAGTTCTCA 
TAAACTCATT 
TTTGGCATTA 
AATTCGGGGA 
GCTACAGGCA 
CACCTCTGGT 



CCTGTGTGTC 
GACTGGCAGT 
ATTCCACGAA 
GACACAGTGA 
AAGAATGACA 



CTATTACTGG 



GTCAACTCGC 



C GTACTTGAAG 



CCTTGAGCTC 
CAAAGAAGAC 



CAAATCCAGT 
GGAGTGCACT 
ATCATGATCA 
TTAGCAGTGC 
TAGTTTCTGA 
CCATTCCCCA 
GGCTATCCAC 
GCCAGAGGTT 
CAGAGGCACG 
TAGTGGAAAT 
ATTTTGAGCG 



ACTATCCAGA 
CTTACTCTCT 
CACAGTAGCT 
GAAGCCATTA 



GCGTGCAATG 
CTGACGGTAA 
CTGGTACTTG 
CATCAAGCCI 

TACAATAAAG 
TTTGATTTGG 
GAACATG7GR 
CGTATGAATA 
TCAGAAAGAC 
AGAAAGTACA 



TGTTGGAAGC 



CCAAAACAGT 
TCTTTATCCT 
TTGCAATAAG 
GTCAAGATTC 
TAAAGGTGGT 
TTGGCTATGC 



TTTTGAAAAT 



TTTATTGGAT 
AGCCATCTCA 



TTCGGCAGAT 



TATGAAATAT 
ATCCCAGCAT 
CAACAACGTT 



TGTAAAAGGA 



CCAGGGTTAC 
CTTCACCAAG TTAGGC-CCTC CTGGGTTIAT 
GCGTGCACGC ACAGACAGAC AGACACACAC 
ACACACAGTC AAATACTGTT CTCTGAAAAA 
AAGCATTAAA TATAATAAAC TAATTTAAGA 
AGTGATAAAG TCTCCAGATG CAGTAGCTCA CACTGTAATC ACAGTGACTC AGGAGGCTGA 2880 

GGTGAGAGGA TTCCTTGAGG CCAGGGTTCG AGACCAACC7 TGGGCAACAT AGCAAGACCC 2940 

CATTTCTTAA AAAAAAAAAA AAAAAATTTA AACITAGCTG GC-TATGGTGG CACATGCCTA 3000 

TAGTCTCAGC TACTTGTGAG GCTGAGGCAG GAGGATTCTT TC-AGCCCAGG AGTTTGAGGT 3060 

TACAGTGAGC CACAATCACA CCAATCACTG CACTCCAGCC TGGGCAA7AA AGTAACTCTT 3120 

GACTCAAAAA AATAAAAAAA ATTGTAGTGG TAGCCATGTG TTAATTGTTA AATAAATTCT 3180 

CCAAAGGGCT AAAAGTAAAT TACTTATAAA TTTTTTATAG TTGTATT7TT GACCTGCCTT 3240 

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAGATGT AGGCATACAG ACAAATACAT 3300 

AAACCAATGA ATATATTACA TATTCTGTGT TCCAATAAAA CTTTATTTAT GGACACTAAA 3360 

ATTTGAATTT CATAAAATTT TCCCATGTCA AGAATACAAA ATACTTGAGT TTTGTTTTTA 3420 

GCTATTTAAT AATAGGTCTC ATTTATTCCA CAGGCTGTAG TTTGTAG7CT TGCTTGAAAC 3480 

AATAGAAACA GACTGATTAA GCAGGAGAAG TTTTTTGAAA GAATTTTGTT TGGCTCACGG 3540 

AATTATTAGA AGGCAGGTGA ACCAGGAGGG TAAGCTTCCA GCAGCAA-TT GTAAAACCAT 3600 

GCCTTAGAAT TGGACTAAGG AAGAAGCTGC TGACACTCCA CTGCCACACA GGGCACTGGA 3660 

AGAAAGTGCT GCTGCCTCCC TGCCCCACCT TTGCCAC1TC TGCAGCAGGA ATAGGTAGAA 3720 

GAATGCCCCC ACCCGCACCG GAACAGCAAC AAAAGGATTC TGCATGAGAT GCCTCCCTAA 3780 

ATTGCTGAAT TCAAAAAAGA AGTTGCATAC AAAGACATCT GAT7GAAAAA GGGTATGTTA 3840 

TATGCCCCTT TCATAGGCTG CTAGGGAGTT TTCCTGGTTC TACTTTCAGG TGGTGGGATC 3900 

AATAAGACCA GAATTTCTCA TATGTTGTGA GAGGATTCAA ATGTTACAGG GTTGCCAGCC 3960 

AAACTATCAA TCATGTATAA ATCCAACAAA CACTTTGTAA CATACAAGAA CTCAGGAAAT 4020 

GTGAACCATT GTTGGAGAAT CTACTAAAAT ACGGCTTCCC GCAAACGAAG ATGAATGGAA 4080 

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA CCAGAACTAA 4140 

GATGTGGAGA CTATTGCCAT AGACCACAAT GTAAATTTTT AAGTGAGGAA GGAAAAATCA 4200 

GGAATCAAAA GGGGCCAGGT GCAGTGGCTC ACATCTATAA TCCCAGAGCT TTGGGAGTTC 4260 

GAGGCAGGAG GATCACTTGA AGCCAGTTTT GAGACCAGCC TATGCAACAC ATTGAGACCC 4320 

TATCTCTACA AAAAATAGAT TAGCTGGGCA CGGTGGTGCA TGCCTATTGT CCTACCTACT 4380 

GTGGAGGCTG AAGTAGGAAA TCACTTGAGC CCGAGAGTTT GAGGTTACAG TGAGCTATGA 4440 
TTATACCACT GCACTCCAGC CTGGGCAAGA GAGCAAGACC TTGTCTCTT 



1 11 21 31 41 51 

I I I I I I 

MNGEYRGRGP GRGRFQSWKR GRGGGNPSGK WREREHRPDL SKTTGKRTSE QTPQFLLSTK 
TPQSMQSTLD RFIPYKGWKL YFSEVYSD3S PLIEKIQAFE KFFTRHIDLY DKDEIERKGS 
ILVDFKELTE GGEVTNLIPD IATELRDAPE KTLACMGLAI HQVLTKDLER HAAELQAQEG 
LSNDGETMVN VPHIHARVYN YEPLTQLKNV RANYYGKYIA LRGTWRVSN IKPLCTKMAF 
LCAACGEIQS FPLPDGKYSL PTKCPVPVCR GRSPTALRSS PLTVTMDWQS IKIQELMSDD 
QREAGRIPRT IECELVHDLV DSCVPGDTVT ITGIVKVSNA EEGSRNKNDK CMFLLYIEAN 
SISMSKGQKT KSSEDGCKHG MLMEPSLKDL YAIQEIQAEE NLFKLIVNSL CPVIFGHELV 
KAGLALALFG GSQKYADDKN RIPIRGDPHI LWGDPGLGK SQMLQAACNV APRGVYVCGN 
TVT LSKDSSSGDF ALEAGALVLG DOGICGIDEF DKMGNQHQAL LEAMEQQSIS 



AIRAGKQRTI SSATVARMNS QDSNTSVLEV VSEKPLSERL KWPGETIDP 

IPHQLLRKYI GYARQYVYPR LSTEAARVLQ DFYLELRKQS QRLN3SPITT RQLESLIRLT 

EARARLELRE EATKEDAEDI VEINKYSNLG TYSDEFGNLD FERSQHGSGM SNRSTAKRFI 

SALNNVAERT YNNIFQFHQL RQIAKELNIQ VADFENFIGS LKDQGYLLXK GPKVYQLQTM 



I I I I I 1 

ACCAGATCCC AGAGGCTGAA CACCTCGACC TTCTCTGCAC AGCAGATGAT CCCTGAGCAG 

CTGAAGACCA GAAAAGCCAC TAAGACTTTC TGCTTAATTC AGGAGCTTAG AGGATTCTTC 

AAAGAGTGTG TCCACGATCC TTTGAAGCAT GAGTTCTTAC CAGCAGAAGC AGACCTTTAC 

CCCACCACCT CAGCTTCAAC AGCAGCAGGT GAAACAACCC AGCCAGCCTC CACCTCAGGA 

AATATTTGTT CCCACAACCA AGGAGCCATG CCACTCAAAG GTTCCACAAC CTGGAAACAC 

AAAGATTCCA GAGCCAGGCT GTACCAAGGT CCCTGAGCCA GGCTGTACCA AGGTCCCTGA 

GCCAGGCTGT ACCAAGGTCC CTGAGCCAGG TTGTACCAAG GTCCCTGAGC CAGGCTGTAC 

CAAGGTCCCT GAGCCAGGTT GTACCAAGGT CCCTGAGCCA GGCTACACCA AGGTCCCTGA 

ACCAGGCAGC ATCAAGGTCC CTGACCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 

CAAAGTTCCT GAGCAAGGAT ACACCAAAGT TCCTGTGCCA GGCTACACAA AGCTACCAGA 

GCCATGTCCT TCAACGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC AGAAGTAATT 

TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCACCAGATG CTGGACACCC TCTTCCCATC 

TGTTTCTGTG TCTTAATTGT CTGTAGACCT TGTAATCAGC ACATTC-TCAC CCCAAGCCAT 

AGTCTCTCTC TTATTTGTAT CCTAAAAATA CGTACTATAA AGCTTTTGTT CACACACACT 

CTGAAGAATC CTGTAAGCCC CTGAATTAAG CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 

GGCTGCTCAG GGTTCATCTG AAGATTCGAA TGAAAAGAAA TGCATGTTTC CTGCTCTTCC 
CTCATTAAAT TGCTTTTAAT TCCA 



MSSYQQKQTF TPPPQLQQQQ VKQPSQPPPQ EIFVPTTKEP CHSKVPQPGK TKIPEPGCTK 
VPEPGCTKVP EPGCTKVPEP GCTKVPEPGC TKVPEPGCTK VPEPGYTKVP EPGSIKVPDQ 
GFIKFPEPGA IKVPEQGYTK VPVPGYTKLP EPCPSTVTPG PAQQKTKQK 
C CTCATTGCCC 

AAGGCTCGTT AGAATTCGCC CTAGAGCTGT ATCATGTATT TTCTTTCAAA TTAACTTTGC 
A GCTTAGGGAA CCAGCAACAA AAGCAAACTT GGCCCGAGGT CGTTCACCGC 
A AAGATTCCTG CQGCCAC-CGC 
A GTGCCCCGAC CGCAGAGGCG ACGACAGGGG AGCAGGAAGC TGCTCACGGT 
AGTCGGCGTT GGCGGCAGCG GTGGCCTTCC TCATCTGGGC GATC-TGGGCT CCTAGAAGAG 
TAAGGATAAC ATCCTGGAAA TGACTTCTGT ACGGTTTGAG CCCAACTGCA CACTCATGAC 
TTGGAGCTGC CCTGTGGAGT TACAGTTTAC CAAACACATT CATC-AACATA ATCTCATTTA 
CTAAAAACTT TGTGAGAATT TTCTTTTACT AAAATTTTTT CTTATTACAA A 

Seq ID NO: 48 DNA sequence: 






I 



A AAAGAAAATT CTCACAAAGT 

TTTTAGTAAA TGAGATTATG TTCATGAATG TGTTTGGTAA ACTGTAACTC CACAGGGCAG 

C1CCAAGTCA TGAGTGTGCA GTTGGGCTCA AACCGTACAG AAGTCATTTC CAGGATGTTA 

TCCTTACTCT TCTCGGAGCC CACATCGCCC AGATGAGGAA GGCCACCGCT GCCGCCAACG 

CCGACTACCG TGAGCAGCTT CCTGCTCCCC TGTCGTCGCC TCTGCGGTCG GGGCACTTTC 

CCCAAAGCGC TGGCCGCAGG AATCTTTCCC CTTAAATCGG GGAAGAAGTT TCTCTAATCC 

ATTTTCGCGG TGAACGACCT CGGGCCAAGT TTGCTTTTGT TGCTGGTTCC CTAAGCTTAA 



TTGCAAGCAA AGTTAATTTG AAAGAAAATA CATGATACAG CTCTAGGGCG AATTCTAACG 
AGCCTTGGGC AATGAGGGAA GAACGTGTCT AGTTATCCAC AGCCCGGGGA CGCCTGCACA 
CGACGCT 

Seq ID NO: 49 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



CCCAACCAAA 



GCTGCTCGTT 
TCTGTTGTCT 
CGAAACCATT 
GACCAGTTTT 
GTGTTTAAAA 



TGTCTCTCCT 
CTTCTCTGAT 
GGGTGTCACA 
CCTGGAAOJG 



CTTTCTCTGA 
TCGCTGTAGC 
TAAAGAAACT 



ATTTATATTA 
TTTTCTTTGA 
CGTTTCATTA 



CATGACCATG 
CACCCCCAAA 
AATACAMGA 
CAGCATCTCC 
TTATGGTAGC 




GGGAGAAGCT GACCGGTGAG 
CGGCTTTTTT GGGAGAACCC 
GGGCCGCCTC 



TACCCAAACC CTTTATGCTT 
TGCTGCTTCA 
GCTGTATTCC 
CAGAGTTTGC AGTGAGCCGA 



TGAAGGAAAG 
AGAGTCATAA 
GGAAATGGAT 
ATTTCTAGCT 
CCCCTCCCTT 
ACAACCATCT 
CCAGGGTTAA 
CCCTGCACCT 
CAGCCAGCTA 
GATGAGGATG 



ATTAGCCCCT 
CAGGTTTTCC 
GTAAATTATT 
GGAAGGTCTT 
TCCACCTTCA 
TCCCACCTAT 



GAAAGTCCCT 
CAGATTAGCA 
CTGAATGTGT 
GGACTCTGAG 



GGTTTTCAGG 
TCTGGGTGGA 
GAAGTAGGAG 
ACCAGTCAGG 



TCATGTGTGC 



CTCATGAAAC C 



GTGCCAAAAA 
GTAGTGTGAG 
TCCTTGAGGC 
CCCAAGGGAC 



TCATGTGTGA 
AGGGCTCATT 
CACACAGCCC 



ACAAGGGGTC 
CAAGGAGGGC 
AAGAGTGCCC 
GGCAAGGCTC 
GCCTGCTGCT 
CAGTGAAAAT 
GTGTTCATCA 



CCTCCCACCC AGGGTGCTGG 
GCTGTGGCAC ACCTTGGATG 
CGAGTGTACA CCAAACAAAG 
GCTGTATCCA GTTTCCATTG 



TGGGGGGCAG 
GTGAAGGGAA 
AAGGGTAAAG 
GGGAGGAAGG 
GGAATTGGGA 



CCACCTCAGC 
TGTCCCACAG 
AACAGGACTC 
CACCCCTCCC 



AATAAGCCGA 



AAAAGATGGG 
CCATTTCATT 
TCCTCTGCTC 
AACACGGGGA 
AGATGTCCCC 
TCAAGGCAAG 
ACATCATTTT 
GCCAACCGGT 



ATATGTGTAA GCAGGTTAAT 
T TAAATTACAG 
A TAGTCATTGA 
C AGAACTAGAA 
A TTTATAGAAA 



CCTCAGTAGA T 
AGTTCATAGC A 
TGACAAGATA T 



CATCTTATAA AAGCCAGCTG GCCATTGCCT 
ATTCTGCTCC GTATACCAGG TAAGTCTCTG 
TTTTCATTAT TAGAAATTAG CTAAAGGCAA 
AATGGGAGAT AGAGAATAGT GGAATATCTT 
AAAGGACCTT AGAGATGGIT AGGGCTCCCA 



AGTAATTGGC 



AGATGGGAAG 
GAGGCTTAGA 
GAGGAAAGTG 
GAAGCCAGCT 



CTAGCAGGAA 
ATGACGGAGA 
TAGGGTGTCA 
AAAAGCATTT 
TGAATATAAA 



CTCAGGCCAG 
TTTTAATTTA 
CACTTTGGAC 
GGTAATACAT 
TGGGCAGAGA 



AGCACTCTCA GTAACACTGC AATTTCCCCC 
TTAGATGGAT CTCTACTGAG CATTTATTCC 
AAATCAATGC CCTAACGTAC TTACTTAACA 



GGAAGGGACT 
GCCATCCTAT 
ATTTTCCAAA 



3TACTAGGA 

AGACCTAATA TGCGGACCTC A 
AACAGATATA AGGTGCCTTG G 
GCAGAAGAAA TTGCCTTTTA GCTCCTCCTC 
TTTCATTTTT AAATGTAATG GGGGAGCTAA 






WO 02/086443 PCT/US02/12476 

GGGAGATGAA AGGCTTTCTC TTCTAAAGGC- TCCTGAAATA AAATCTGTTT GGCATTGAAT 1920 

TTGTATCCAT CTTTCTTTAA TTGAATCACT GTGTCAGCTT TCTGTCTCTA GAAAAAAACA 1980 

CATTTGAAGC ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA 2040 

GCAGCAGCAG GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC 2100 

CAAGGAGCCC TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG 2160 

CCAGCCCAAG ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT 2220 

CACTCCAGCA CCAGCCCAGC AGAAGACCAA GCAGAAGTAA TG7GGTCCAC AGCCATGCCC 2280 

TTGAGGAGCT GGCCACTGGA TACTGAACAC CCTACTCCAT TCTGCTTATG AATCCCATTT 2340 

GCCTATTGAC CCTGCAGTTA GCATGCTGTC ACCCTGAATC ATAATCGCTC CTTTGCACCT 2400 

CTAAAAAGAT GTCCCTTACC CTCATTCTGG AGGCTCCTGA GCCTCTGCGT AAGGCTGAAC 2460 

GTCTCACTGA CTGAGCTAGT CTTCTTGTTG CTCGGGTGCA TT7GAGGATG GATTTGGGGA 2520 
AGGTCAAGTG A 



Seq ID NO: 52 DNA sequence 

Nucleic Acid Accession #: NM_002638.1 

Coding sequence: 120-473 



I I I I I I 

CAATACAGCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 
GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 
3 ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 



C CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 
TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 
CGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 
TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 






Seq ID NO: 54 DNA sequence 
Nucleic Acid Accession #: NM_019618 
Coding sequence: 75-584 

1 11 21 31 41 51 

I I I I I I 

GGCACGAGCC ACGATTCAGT CCCCTGGACT GTAGATAAAG ACCCTTTCTT GCCAGGTGCT 60 

GAGACAACCA CACTATGAGA GGCACTCCAG GAGACGCTGA TGGTGGAGGA AGGGCCGTCT 120 

ATCAATCAAT GTGTAAACCT ATTACTGGGA CTATTAATGA TTTGAATCAG CAAGTGTGGA 180 

CCCTTCAGGG TCAGAACCTT GTGGCAGTTC CACGAAGTGA CAGTGTGACC CCAGTCACTG 240 

TTGCTGTTAT CACATGCAAG TATCCAGAGG CTCTTGAGCA AGGCAGAGGG GATCCCATTT 300 

ATTTGGGAAT CCAGAATCCA GAAATGTGTT TGTATTGTGA GAAGGTTGGA GAACAGCCCA 360 

CATTGCAGCT AAAAGAGCAG AAGATCATGG ATCTGTATGG CCAACCCGAG CCCGTGAAAC 420 

CCTTCCTTTT CTACCGTGCC AAGACTGGTA GGACCTCCAC CC7TGAGTCT GTGGCCTTCC 480 

CGGACTGGTT CATTGCCTCC TCCAAGAGAG ACCAGCCCAT CATTCTGACT TCAGAACTTG 540 

GGAAGTCATA CAACACT3CC TTTGAATTAA ATATAAATGA CTGAACTCAG CCTAGAGGTG 600 

GCAGCTTGGT CTTTGTCTTA AAGTTTCTGG TTCCCAATGT GT7TTCGTCT ACATTTTCTT 660 

AGTGTCATTT TCACGCTGGT GCTGAGACAG GGGCAAGGCT GCTGTTATCA TCTCATTTTA 720 

TAATGAAGAA GAAGCAATTA CTTCATAGCA ACTGAAGAAC AGGATGTGGC CTCAGAAGCA 780 

GGAGAGCTGG GTGGTATAAG GCTGTCCTCT CAAGCTGGTG CTGTGTAGGC CACAAGGCAT 840 

CTGCATGAGT GACTTTAAGA CTCAAAGACC AAACACTGAG CTTTCTTCTA GGGGTGGGTA 900 

TGAAGATGCT TCAGAGCTCA TGCGCGTTAC CCACGATGGC ATGACTAGCA CAGAGCTGAT 960 

CTCTGTTTCT GTTTTGCTTT ATTCCCTCTT GGGATGATAT CATCCAGTCI TTATATGTTG 1020 

CCAATATACC TCATTGTGTG TAATAGAACC TTCTTAGCAT TAAGACCTTG TAAACAAAAA 1080 

TAATTCTTGT GTTAAGTTAA ATCATTTTTG TCCTAATTGT AATGTGTAAT CTTAAAGTTA 1140 
AATAAACTTT GTGTATTTAT ATAATAAAAA AAAAAAAAAA AAA 



Protein Accession #: NP_062564 



MRGTPGDADG GGRAVYQSMC KPITGTINDL NQQVWTLQGQ NLVAVPRSDS VTPVTVAVIT 
CKYPEALEQG RGDPIYLGIQ NPEMCLYCEK VGEQPTLQLK EQKIMDLYGQ PSPVKPFLFY 
RAKTGRTSTL ESVAFPDWFI ASSKRDQPII LTSELGKSYN TAFELNIND 

Seq ID NO: 56 DNA sequence 

Nucleic Acid Accession #: NM_003125 

Coding sequence: 65-334 



1 11 21 31 41 51 

I I I I I I 

AGCAGTTCTA AGGGACCATA CAGAGTATTC CTCTCTTCAC ACCAGGACCA GCCACTGTTG 60 

CAGCATGAGT TCCCAGCAGC AGAAGCAGCC CTGCATCCCA CCCCCTCAC-C TTCAGCAGCA 120 

GCAGGTGAAA CAGCCTTGCC AGCCTCCACC TCAGGAACCA TGCATCCCCA AAACCAAGGA 180 

GCCCTGCCAC CCCAAGGTGC CTGAGCCCTG CCACCCCAAA GTGCCTGAGC CCTGCCAGCC 240 

CAAGCTTCCA GAGCCATGCC ACCCCAAGGT GCCTGAGCCC TGCCCT7CAA TAC-TCACTCC 300 

AGCACCAGCC CAGCAGAAGA CCAAGCAGAA GTAATGTGGT CACA AT GGCCTTGAGG 360 

AGCCGGCCAC CAGATGCTGA ATCCCCTATC CCATTCTGTG TATGAGTCCC ATTTGCCTTG 420 

CAATTAGCAT TCTGTCTCCC CCAAAAAAGA ATGTGCTATG AAGCTTTCTT TCCTACACAC 480 

TCTGAGTCTC TGAATGAAGC TGAAGGTCTT AGTACCAGAG CTAGTTTTCA GCTGCTCAGA 540 

ATTCATCTGA AGAGAGACTT AAGATGAAAG CAAATGATTC AGCTCCCTTA TACCCCCATT 600 
AAATTCACTT TCAATTCCA 



Seq ID NO: 58 DNA sequence 
Nucleic Acid Accession #: NM_001793.2 
Coding sequence: 71-2560 

1 11 21 31 41 51 

I I I I I I 

AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 
CTCTGCAGCC ATGGGGCTCC CTCGTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 
CTGGCTGCAG TGCGCGGCCT CCGAGCCGTG CCGGGCGGTC TTCAGGGAGG CTGAAGTGAC 
CTTGGAGGCG GGAGGCGCGG AGCAGGAGCC CGGCCAGGCG CTGGGGAAAG TATTCATGGG 
CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 
TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGATCTTCCC 
ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTGGGTG GTTGCTCCAA T. 
TGAAAATGGC AAGGGTCCCT TCCCCCAGAG ACTGAATCAG CTCAAGT 
AGACACCAAG ATTTTCTACA GCATCACGGG GCCGGGGGCA GACAGCCCCC C 
CTTCGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCAC 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGICAGAG AATGGTGCCT CAGTGGAGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGIACTTCTG TGATGCAGGT 780 

GACAGCCACG GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCATAGCCAA GAACCAAAGG ACCCACACGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900 

CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCAGGCC ACAGACATGG ATGGGGACGG CTCCACCACC ACGGCAGTGG CAGTAGTGGA 1020 

GATCCTTGAT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AGGCCCATGT 1080 

GCCTGAGAAT GCAGTGGGCC ATGAGGTGCA GAGGCTGACG GTCACTGATC TGGACGCCCC 1140 

CAACTCACCA GCGTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGGACCATTT 1200 

TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260 

TTTTGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAAGTG ACCAACGAGG CCCCTTTTGT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440 

GCCTGTGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCAGCTACCG 1500 

CATCCTGAGA GACCCAGCAG GGTGGCTAGC CATGGACCCA GACAGTGGGC AGGTCACAGC 1560 

TGTGGGCACC CTCGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680 

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGG TGCTGAACAT CACGGACAAG GACCTGTCTC CCCACACCTC 1800 

CCCTTTCCAG GCCCAGCTCA CAGATGACTC AGACATCTAC TGGACGGCAG AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGC-ATA CATATGACGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG ACGGTGATCA GGGCCACTGT 1980 

GTGCGACTGC CATGGCCATG TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCCCA GAAGATGACA CCCGTGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CAGGACTATG ACATCACCCA 2220 

GCTCCACCGA GGTCTGGAGG CCAGGCCGGA GGTGGTTCTC CGCAATGACG TGGCACCAAC 2280 

CATCATCCCG ACACCCATGT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340 

TATAATTGAG AACCTGAAGG CGGCTAACAC AGACCCCACA GCCCCGCCCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC CGACGCCGCG TCCCTGAGCT CCCICACCTC 2460 

CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GAGTGGGGCA GCCGCTTCAA 2520 

GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580 

GGGACCAAAC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640 

GACTTCGGAG CTTGTCAGGA AGTGGCCGTA GCAACTTGGC GGAGACAGGC TATGAGTCTG 2700 

ACGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAAGCC 2820 

TCTTACCTGC CGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGCCTGCT GTGACTGACC 2880 

TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940 

TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000 

GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060 

TGGATCTCTG CGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120 

GTTGCGTTGC TATAGATGAA GGGTGAGGAC AATCGTGTAT ATGTACTAGA ACTTTTT1AT 3180 
TAAAGAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 59 Protein sequence: 






MGLPRGPLAS 
QEPALFSTDN 
KGPFPQRLNQ 



LKSNKDRDTK 



DEDDAIYTYN 
TDMDGDGSTT 
AWRATYLIMG 

DPAGWLAMDP 



GWAYSIHSQ 
TAVAWEILD 
GDDGDHPTIT 



31 IVTDQND HKPKFTQDTF 



DSGQVTAVGT 
RQITICNQSP 
KQDTYDVHLS 
LVLLLLVRKK 



TWLSLKKFL 
GAVLALLFLL 
GLEARPEWL 
DYEGSGSDAA SLSSLTSSAS E 



Seq ID NO: 60 DNA sequence 
Coding sequence: 162-428 



ANDNAPMPDP QKYEAHVPEN 
THPESNQGIL TTRKGLDPEA 
VPPSKWEVQ EGIPTGEPVC 
LDREDEQFVR NNIYEVMVLA 
VRQVLNITDK DLSPHTSPPQ 
LSDHGNKEQL TVIRATVCDC 



EKETGWLLLN KPLDREEIAK 
RGSVLEGVLP GTSVMQVTA? 
VISSGLDR3K VPEYTLTIQA 
AVGHEVQRLT V 
KNQHTLYVEV 

HQKISYRILR 



AQLTDDSDIY WTAEVNEEGD 
HGHVETCPGP WKGGFILPVL 
QDYDITQLHR 
APPYDTLLVP 
DMYGGGEDD 



I TCGAACGTTC G 



AGCACCTCGG AAGCTGAGGC 
TCTCCCAGAG GAAGCAGATA 
AGCGAAAGAA GCCTCAACTT 
GTTTACTGTT TGTTCATCGA 
GTAGAGTCAT TAACAAGGAG 
GAGGTTAGAA GTCAAAGAAC 
AGATCATAAA GACATTTTTT 

Seq ID NO: 61 Protein sequence: 



TTAGCAGAAG 
CATGTACTGG 
ATATTCTTGA 
ACACATCAGT 



TGACAGAGAG GATGGCGCTG TCGACCATAG 
CTCCCCGTGG CTTTCTAAAG CGAGTCTTCA 
AAAGTGGTGA CTTATTGGTC CATCTGAACT 
AGTCCAGGAC AAACGCTTGT GCGAGTAAAT 
CCGCAGCAAA GGTAATTCTA AAGAAGAGCA 
AAGTTATGAT GCATTCTTTT GGGTGGTAAC 
TAATATGGGA TTATTAAATA TTGG 



VHRLASESRT 



2 DNA sequence 






GAGGCGGGGG 
CCGCGCTCTG 
GAGTGACCTG 
CCATTGGCCG 
TCTCTGGAGC 



CAGCAATTTC 
AGCCAGTGCA 
GTTCGGCCTG 
CTACAAGGGG 



TCGGACCTGC CAAGGCCACC 
CGGCTTTTAC TGCCTAGGAT 
CTGGCAGAGG CGCCCCGAGT 
ACATTGTGTT 



GACGCTGCGG C 
GCGAGCCCAG 
CTTACTGGAT 
CGAAGGGCTG 



GCAAGGGACA 



CACAGGGAGA 



CAGACGGGAA GTCCCAGGAC 
TCAAGCTATT 
CACAGCCAAC 
TGCCCCTCGT 
CGGATGACTC 
TGAGAGTACA 
CTCTGACGGG 
GTGAGACCAG 



CAGGGTGTGC G 
GATGCACTTG GCTCTGGGGG TGATGTGATC 
GGCAACACXC 
CTGGCCCGAC 
CTGGTGGACA 
ATCAAGAATG 
TTCTTCTTCG 
GTGTGCACGA 

CCACGAGACC TGGTGCTGTC TGAGCCAAGC 



C CTGGTGTCCC 
A CAGCTGCCCA 
3 CTGACCCTGA 
3 TCAATGACTT 



AGCGATGACC 
CGCGCCATCC 
CATGTGGCTG 
ATCCTGATCA 



TTGCCCTCTA 
TAGAAGGGCC 
GGAGTGTGCC 
CACAGCAGCA 
GCACGGACTA 



CTCCGACTTC 
TTCCCGGAGA 
GACCTCTGCT 
GTGGACAGCG 

GCTGGGACAG CCACTGCCGA GTGAGCGGCA G 
TGTGCGGCTG 
CGCCAACAGC 
GGAACTGACC 
AGGTGCCACT 

CCTGGGCAGG GTTCAGTGTT 
TATTTGGCCG 
AGCAGACCCT 



AGGACACTAC 720 



ATCCAGAATA CCACAGCCCA 



GACA3CTCSG 
CAGCCTCCTG 
GGTCCTCAGT 



AGCCAATCCT 
CAGTACACTC 
GTCCCAGCTG 
GTGACTGTGA 
ACCACTGCCC 
GTGGCCTGGC 



TGAGGTGACC 



G GAGCCACCGC A 



CGAGGTGGCC 
AACAGACCTG 
CCCTGGTGCC 



ACCCCTGCAA 



CCGTGGCTAC 
ACTGCCCTCT 
CCTCACACTC 



TTGGAGCCTG 
CCCGCCACTT 
ATCC7GGGCC 



GGCTTAGCTA 



TGTCAGATGC 
GGATTAGCTG 
CTGCCACAGA 



CACTGTGCGG 
CCGCCGGGAG 
AACGCGAGTG 
GAGCACAGGC 



ACCCAGTACC 
GGGAGTCAGA 
GTGTCTGCTC 
CCGGAAACTC 



GAGTGGGTCC C 



GGQGACCCGT 
AGTCCAGCCA 
GAACCACCTA 
TCATCGTGGC 
GCAGCTCATC 



GATG7GACCC 
TACACTCTGC 
GAGCTGCCTG 
CGAG7GTCCT 
T C-CGCAGCACC CAGGGGGTTG 
GTTCAGGCTG 
AGTGCCAG7G 
CGGGTTGTGG 



GGGTTCCTGG C 



2100 
2280 




CCCAGTTGGT TTCTGGGGAG 
AGTATACGGT GCATGTGAGG 
TTGTGAGGAC TGCCCCTGAG 
CCAGCGACGT TCTACGGATC 
CCTGGGGCCG GAGTGAAGGC 
CTGCAGAGAT CCGGGGTCTC 
TCGGGGACCG CGAGGGCACA 
CAGCCCTGGG GACGCTTCAC 
AGCCGGTGCC CAGAGCGCAG 
AGTCCCGGGT CCTGGGGCCC 
CACAGTACCG CGTGAGGCTG 
TGACTGCGCG CACTGAGTCA 
CGATCGACTC GGTGACTTTG 
CCTGGCGGCC ACTCAGAGGC 
GGATCTCAAG CTCCCAGCGG 
TGACGCCTGT CCTGGATGGT 
GCCCCCGTGG CCTGGCGGAT 
GTGCGGAGGC TACGAGGAGG 
CACAGGCAGT TCAGGTTGGC 
TGAATGGCTC CCATGACCTT 
ACCCAAGTGG GAACAACCTG 
CAGATGCTCC TGGGCGCCGC 
CCTTGAGAGG TGACATATTC 
TGATGTTGGG AATGGCTGGA 
ACTCTGTCCA GACCTTCTTC 
GTCTGGCCAC AGCCCTGTGT 
CAGTGTATTG TCCAAAGGGC 
TTGGGCCTCC TGGCGACCCT 
CCCCTGGAi 



GCCCATGTGG 



CCTGTCTCCA 
GTGGTGCAGC 
GGCTTCCTTC 



GCCTGGACTC 
CCTGGCCAGG 
GTGACAGGGC 
GTGCGGGGTC 



CTGAGCTGGA TGGACTGGAC- CCAGATACTG 
CTGGCGTGGA TGGGCCCCCT GCCTCTGTGG 
GTGTGTCGAG GCTGCAGATC CTCAATGCTT 
GGGTCACTGG AGCCACAGCT 7ACAGACTGG 
GGCACCAGAT ACTCCCAGGA AACACAGACT 
TCAGCTACTC AGTGCGAGTG 



AAGTGCCTGG 



CTGAGGCATC 
TACCACATGC 
GTCTGGTGTT 



GTCCCCGCAG 
CGTCTCTTAC 
TGTCACACAG 
CACTCAAGAC 



GTCCTGGAGC 
CTGCTGTCTT 
GGCATTATCT 
GGCACAGCCG TGGTCACAGC TCACAGATAC 
CAGCACGTAC CAGGGGTGAT GGTTCTGCTA 



r ACAGTCATCG GCCCTCCCCA 
r TGCAAAGGAT 



TACATCCTAT 
ACACTTCCAG 
ATCTTCTCCC 
ACGCCAGTGT 
AATGCTCACC 
CCTCTTCGGC 
CTGT7CCCAC 
CCCTACATGG 
ATGTTGGCAC 
GTGGATGAAC 



2400 
2450 



30S0 
3120 
3180 
3240 

33S0 
3420 
3480 
3540 
3600 






AGCCCCATCC 
GCGGACCCAG 
GCCGTGGATG 
CAGGCATCCT 



AGCAGCTGCG 
ATGGGCCAAG 
TCACTACTCA 
AACCTGGAGA 



GGCTTCTGGG C 
TCGCTTGGCG CCGGGTATGG 3720 
CCTGGACCAG G 



GCAGCCCTGG CCGCGCCGGG AATCCTGGGA C 
CAGGGTTGCC TGGCCCTCGT 
AGCCGGGGGC TCCCGGACAA 
GGGACCCTGG ACCATCGGGC 
GTGGCCCCCC AGGGCTTCCT 
GGGGTCCCCC TGGACCAGGT 
TTCCCGGAAG CCCTGGACCC 
GTGACTCTGA GGATGGAGCT 
GCCCACGGGG ACCTCCTGGA 
TGGGTGAGGC TGGAGAGAAG 
TGCCAGGGGT TGCTGGACGT 
3 AGAGAAGGGG 
; ACCCAAAGGA 



GATGGGCCTG 
TGCTCCCGGC 
TGGAGCAGAT 



C CTCGTGGACC 
A TGAAGGGTGA 
A TTGCTCCTGG 



ACCTCGAGGC 
TGGGCTTCCT 
ACTGGGGGAC 
CAAAGGCGAT 
GGAGCCTGGG 
TGGAAAGAAA 



CCAGGACCCC 



ACTCAGCGCC 



3 ACGAGATGGT 



GCTATTGGCC 
GGCGAACGTG 
CCTGGAGCCA 
GAGCCTGGTC 
GAAAAGGGAG 
3GCCCACCCG 
GGTCCCATTG 
AAGGGAGACC 
GAAGTTGGAG 
GGCGAGCGTG 




AGGGTCCTGA AGGGCCACCA GGACCCACTG 
GCCCTGGGGA CCCrGCAGTG GTGGGACCTG 
ATGTGGGGCC C6CTGG6CCC AGAGGAGCTA 
GCTTGGTTCT TCCTGGAGAC CCTGGCCCCA 
GCCTTACTGG CAGAGCAGGA CCCCCAGGTG 
CTGGGCGGCC TGGCCCCCCA GGACCTGTTG 
AGAAACCTGA CCAOCCTCCT CCGGGTGACC 
GCCTTCGGGG GGCACCTGGA GTTCGGGGGC 



AGCCGGGTCC 
AGGGAGAGCC 
TCCCTGGAGC 
GGGACCCAGG 
ATGGCCGGAG 



TGGGGACCGC 
CCCTGGGGAA 
TGTCCGAGGC 



CCCTGGGGAA AGGGGCATTG 



3 ACTCAAGGGT 
3 GGGTGTGCCA 
\ CCCGGGTCTA 



GACAAGTGGG G 



TCTGTGGATG 
GCTAAGGGGG 
GGCATCAAAG 
CCAGGAGAGC 
CCCCCTGGCC 
GGCCCTGCAG 
CCAGGACGGG 
TCAGGCCTTG 
AAGCCGGGAG 
GGTGTGCCAG 



GAGAGAGGGG 
CTGGACTCCC 
AGCCAGGTCC 



TGGAACCCCT 
TGGACTCTCT 
CAATGGTGAC 
AGAGCCTGGA 
TGGGCCTGAA 
TCATGGAGAC 
ACCTTCTGGC 
ACCTACTGGA 
GGGGTCTCCA 
AGATGGTGCC 
TCTGCCTGC-C 



CAAGGTCCCA 



CAAAGGGAGA G 
CCAAAGGTGA C 
GGGAGCCCGG A 



AGGGAGATCT G 
AGACAGGCCC T 
GCCCCCCAGG G 
TGGGACCACC T 



C CCTGGAGGCC 
3 CCAGGGCCGC 
3 GAAGATGGTC 
3 GTCCCGGGCT 
r GGCCTGCCCG 
3 ATGGGTCAGC 
A ATCCCAGGAC 
T GGACTCAAAG 





6540 



CCTGGACCAC 



GCTGT3GGAC 



AGTGGAAAAG 
CCTGTCGGAC 
CTCCCTGGAG 
GAGCCGGGAG 
GC-CCGTGCAG 
AAAGGTTTCA 
CCAGGTGTGA 
TTCCCGGGTC 
GGTCTGGCAG 



AGACCCTGGA GTAGGC-CTGC 



7380 






GTGACAAAGG 
CTGGTGACAA 
AACCTGGTGC 
GTATCCGAGG 
GGGGAGTGAA 
CCCCAGGCCG 



GGGCTCAGCC 



ACGGGGCCCC CTGGCAGCAG GGGAGAGCGT 
CTCAGCTGTG 
AGGGCCTCGG 
CAAGGGCAGC 



AAGGGGGACA 
GACAATGGGG 
GGGTTGCCAG 



TGGGTGAACG 
ACCCTGGTGA 
GACTGCGTGG 



ACGGGCAGCC 



AGAAAAAGGA 
GGGAGCCTGT 
CCCCGGGCTG 
GGCCCCTGGC 
AGGCCCCAAG 
CCCAGGCCCC 



GATGTTGGCT T 



AAGGAGGGCC 
GGTGACCAGG 
AGTGGAAATG 



TGATCGGTCC C 



3 AGTGGTGGGG G 



TTCAGGGCCA 
TCCCTGGAGC 
GCGAGAAGGG 



GAAGGGTGAG 
AGAAGCTGCA 



TCATCGCATC 



AGGATGATGA 



ACACCCTGCG 
TCTATGGTGG 
GCTGCCCACC 



TGGATCACGA CCCCTCCCTA 
GCCTGTGCTC CGCGTCTCTC 
GTACTCTGAA TACTCCGAGT 
TAGTGATGAC CCCTGTTCCC 
CTGGTACCAT CGGGCTGTGA 
CTGTGGAGGG AATGCCAACC 



ATGCAGAGGA 

TGCCACTGGA 
CAGGCAGCAC 
GTTTTGGGAC 



CCCTCCCCTT 
TCAGTGACTT 
CCTGCCACCC 
ACTGGCGTCT 
GCATTAAAGC 



GGTGCTAGAG 
GGTCCCGTGG 
TGGCAGATGA 



: ATCCCCTGGA 



GTCTAGCCTT 
CTCACTGTGG 
TTGACCCAAG 
AAAGGCAAAA 



GGAGTCGGGG 
ACGTGAGCGT 
CCCCCCTGTG 
GGGGGTGGCT 



TCTCAGCAGA A 
GCGAGTGCAC G 
GACAAACCCC C 



CCTGTGATGA CA7GGTGCTG ATTC7GGGGG 



75S0 
7630 



AAGGGAGAGC 
CCCCAGGGTC 
GGAGTGCCTG 



G AGACAAGGGA GAAGCTGGTC 



GGCTCCCAGC 
GTACCCCCTG 
GACCCTGAAG 
7GCACTGCCT 
CACCCTTTTG 



GGACAGGTAC TGCCCAGGAC TGAGGCCCAG 
ACCCCACTGT 
GTCCGTTATT 
CATTGTGGCT 






MTLRLLVAAL CAGILAEAPR 
LEGLVLPFSG AASAQGVRFA 
AAILHVADHV FLPQLARPGV 
EELKRVASQP TSDFFFFVND 
SEPSSQSLRV 
TEYQVTVIAL 
RVLSGGPTQQ 
LRPVILGPTS 
RLTLYTLLEG 
VRSTQGVERT 
VPGLRVWSD 



VRAQHRERVT 
TVQYSDDPRT 
PKVCILITDG 



ELRWDTSID 
GVSYIFSLTP 
LALGPLGPQA 
AHRYMLAPDA 
RRLAPGMDSV 
EMGLRGQVGP 
APGLKGSPGL 
PLGDPGPRGP 



QELGPGQGSV 
ILLSWNLVEE 
HEVATPATW 
LVLPGSQTAF 
ATRVRVAWGP 
REEGPAAVIV 
VSGEATVAEL 
VLRITWVGVT 
REGTPVSIW 
VLGPELSSYH 
SVTLAWTPVS 
VLDGVRGPEA 
VQVGLLSYSH 
PGRRQHVPGV 
QTFFAVDDGP 
PGDPGLPGRT 



RSNFREVRSF 
GDVIRAIREL SYKGGNTRTG 
KSQDLVDTAA ORLKGQGVKL FAVGIKNADP 
VSRRVCTTAG GVPVTRPPDD STSAPRDLVL 
GYKVQYTPLT GLGQPLPSER QEVNVPAGBT S 

llrdlepgtd' YEVTVSTLFG R 
ARGYRLEWRR 

PTGPELPVSP VTDLQATELP GQR\ 
DLDDVQAGLS 
VPGASGFRIS 
ARTDPLGPVR 
DGLEPDTEYT 
GATAYRLAWG 



LDGLEPATQY 
RASSYILSWR 
SVTQTEVCPR 
RPSPLFPLNG 
MVLLVDEPLR 
SLDQAVSGLA 
GAPGPQGPPG 



GEGP3AEVTA 
GSPQTLPGIS 
ATQDNAHRAE 



LDGLQPGTEY 
VPGATQYRII 
VRREPETPLA 
DITGLQPGTT 
GATGYRVSWH 
7APEPVGRVS 
IRGLEGGVSY 
PRAQGFLLHW 



PGSPGEQGPR 
EGPPGPTGRQ 
LPGDPGPKGD 



DGRNGSPGSS GPKGDRGEPG PPGPPGRLVD 



IPGLPGRAGG 
PGLSGEQGPP 
AGPEGKPGLQ 
GPTGAVGLPG 



RGPPGPQGDP 
PGLRGEQGLP 
DGPKGERGAP 
GLRGEPGSVP 
GKEGPIGFPG 
VGEAGRPGER 
GLKGAKGEPG 



GVRGPAGEKG 



SNGDQGPKGD 
QGSPGLPGQV 



7/VHVTQASSS 
VHVRAHVAGV 
RSEGGPMRHQ 
GTLHWQRGE 
RVRLSVLGPA 
PLRGPGQEVP 
GLADWFLPH 
SHDLGIILQR 



APGQVIGGEG PGLPGRKGDP GPSGPPGPRG 
PGPGEGGIAP GEPGLPGLPG SPGEQGPVGP 
GPPGAIGPKG DRGFPGPLGE . 
GEKGEPGRPG DPAWGPAVA > 
PGDRGPIGLT GRAGPPGDSG PPGEKGDPGR 
PGKAGERGLR GAPGVRGPVG EKGDQGDPGE 
TGPGAREKGE PGDRGQEGPR GPKGDPGLPG 
DRGPPGLDGR SGLDGKPGAA GPSC-PNGAAG 



GFPGVPGGTG PKGDRGETGS 
ETWDESSGSF LPVPERRRGP 
LGERGPPGPS GLAGEPGKPG 
PGTPGPPGPP GPKVSVDEPG 
GEPGPRGQDG NPGLPGERC-M 



RGVPGIKGDR 
PGLAGPAGPQ 
GETGKPGAPG 



GWGFPGQTG PRGEMGQPGP S 



PGVGVPGSPG 



3 PRGAKGDMGE 
GLLGPQGQPG AAGIPGDPGS 
GDKGEAGPPG RPGLAGHKGE 
GERGTPGIGG 
APGERGEQGR 
ADTAGSQLHA VPVLRVSHAE 
DEGSCTAYTL 



PGKDGVPGIR 
MGEPGVPGQS 
AGPPGPPGSV 



GAPGKEGLIG 



TEACHPFVYG 



RDGA3GKDGD RGSPGVPGSP 
DLVGEPGAKG DRGLPGPRGE 
PPGPPGVKGD LGLPGLPGAP 



2040 
2100 
2160 

2280 

2400 



3 DKGSKGEPGD KGSAGLPGLR 



PKGDRGFDGQ PGPKGDQGEK 
QKGERC-PPGE RWC-APGVPG 
HCACQGQFIA SGSRPLPSYA 
EEYQDPEAPW DSDDPCSLPL 
TREACERRCP PRWQSQGTG 



2700 
2S80 
Coding sequence 



ATGTCTTATC AACAGCAGCA GTGCAAGCAG CCCTGCCAGC CACCTCCTGT GTGCCCCACG 
CCAAAGTGCC CAGAGCCATG TCCACCCCCG AAGTGCCCTG AC-CCCTGCCC ACCACCAAAG 
TGTCCACAGC CCTGCCCACC TCAGCAGTGC CAGCAGAAAT ATCCTCCTGT GACACCTTCC 
CCACCCTGCC AGCCAAAGTA TCCACCGAAG AGCAAGTAA 



Nucleic Acid Accession #: NM_005629.1 
Coding sequence: 639-2546 

1 11 21 31 41 51 

I I I I I I 

TAGTCGGAGC GAGGTGGCGA GTCGCTGAGC CCGCCGCGGC CCCGAGAGCG GCTGCAGCCG 60 

CCGCCGCCGG GAAGGAGAGG GCGAGGCGCG CCCGAGCCGC CGCCGCCGCC GCCACCGCCG 120 

CCGCCGCCAC CACCGCCACC GGAGTCGCGG GCCAGCCGGG CAGCCTCCGC GGGCCCCGGC 180 

CGGGGCGGGG GGCGCGGGCC ACAGGCCCCT GCTCCGGCCG TCGTTTGCAG ACCGCGGGCG 240 

CCGATGTCGC CCGCGCCCCG TTAGGATGAG TCTCGGGTCG GGCGAGGAGC CGCCGCAGCC 300 

GCCGCCGCCC GAGCCGCGGG CAGGAGCCTC GGGAGCCGCC GCCGCCGCCG CCGCCGCCCG 360 

GCCGGGCCCC GACGCCGCCC GCGCGCCCCC GGGCCCCCGA CACACATGAG ATTCTTCAGG 420 

CTCACTTTCA AGTGCTTCGT GGACTGCTTC TGACTGCGCC GCCCGCGCCC CGCACCCCGC 480 

CGTCCGCCCG CCGCCCCGTC CCCCGGCCCG GCCGCCCCCC GGCCCCCGGC CGGCCCGCGC 540 

CCTCGGGGCC CTCCCCGGTG CCGCCGGTGC CCCCCGCCTG ACCGCCGCCC CCCGTGAGGC 600 

GCCGCGACCC CGGCCCGGCC GTGCGGCCCG CCGGGGCCAT GGCGAAGAAG AGCGCCGAGA 660 

ACGGCATCTA TAGCGTGTCC GGCGACGAGA AGAAGGGCCC CCTCATCGCG CCCGGGCCCG 720 

ACGGGGCCCC GGCCAAGGGC GACGGCCCCG TGGGCCTGGG GACACCCGGC GGCCGCCTGG 780 

CCGTGCCGCC GCGCGAGACC TGGACGCGCC AGATGGACTT CATCATGTCG TGCGTGGGCT 840 

TCGCCGTGGG CTTGGGCAAC GTGTGGCGCT TCCCCTACCT GTGCTACAAG AACGGCGGAG 900 

GTGTGTTCCT TATTCCCTAC GTCCTGATCG CCCTGGTTGG AGGAATCCCC ATTTTCTTCT 960 

TAGAGATCTC GCTGGGCCAG TTCATGAAGG CCGGCAGCAT CAATGTCTGG AACATCTGTC 1020 

CCCTGTTCAA AGGCCTGGGC TACGCCTCCA TGGTGATCGT CT7CTACTGC AACACCTAC7 1080 

ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGGTCAA GTCCTTTACC ACCACGCTGC 1140 

CCTGGGCCAC ATGTGGCCAC ACCTGGAACA CTCCCGACTG CGTGGAGATC TTCCGCCATG 1200 

AAGACTGTGC CAATGCCAGC CTGGCCAACC TCACCTGTGA CCAGCTTGCT GACCGCCGG7 1260 

CCCCTGTCAT CGAGTTCTGG GAGAACAAAG TCTTGAGGCT GTCTGGGGGA CTGGAGGTGC 1320 
CAGGGGCCCT CAACTGGGAG GTGACCCTTT GTCTGCTGGC C ' ' ~ 
TCTGTGTCTG GAAGGGGGTC AAATCCACGG GAAAGATCGT G 

C GTGGAGTCCT GCTGCCTGGC G 

G GTCCCCTCAG GTGTG 

C CCAGATTTTC TTTTCTTACG CCATTGGCCT GGGGGCCCTC ACAGCCCTGG 1620 

C CATCATCCTG GCTCTCATCA 1680 

C CAGCTTCTTT GCTGGCTTCG TGGTCTTCTC CATCCTGGGC TTCATGGCTG 1740 

G CGTGCACATC TCCAAGGTGG CAGAGTCAGG GCCGGGCCTG GCCTTCATCG 1800 

CCTACCCGCG GGCTGTCACG CTGATGCCAG TGGCCCCACT CTGGGCTGCC CTGTTCTTCT 1860 

TCATGCTGTT GCTGCTTGGT CTCGACAGCC AGTTTGTAGG TGTGGAGGGC TTCATCACCG 1920 

GCCTCCTCGA CCTCCTCCCG GCCTCCTACT ACTTCCGTTT CCAAAGGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TGCCCTCTCC TTTGTCATCG ATCTCTCCAT GGTGACTGAT GGCGGGATGT 2040 

ACGTCTTCCA GCTGTTTGAC TACTACTCGG CCAGCGGCAC CACCCTGCTC TGGCAGGCCT 2100 

TTTGGGAGTG CGTGGTGGTG GCCTGGGTGT ACGGAGCTGA CCGCTTCATG GACGACATTG 2160 

CCTGTATGAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC TTCTTCACCC 2220 

CGCTGGTCTG CATGGGCATC TTCATCTTCA ACGTTGTGTA CTACGAGCCG CTGGTCTACA 2280 

ACAACACCTA CGTGTACCCG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTGTG CGTGCCGCTG CACCTCCTGG GCTGCCTCCT CAGGGCCAAG GGCACCATGG 2400 

CTGAGCGCTG GCAGCACCTG ACCCAGCCCA TCTGGGGCCT CCACCACTTG GAGTACCGAG 2460 

CTCAGGACGC AGATGTCAGG GGCCTGACCA CCCTGACCCC AGTGTCCGAG AGCAGCAAGG 2520 

TCGTCGTGGT GGAGAGTGTC ATGTGACAAC TCAGCTCACA TCACCAGCTC ACCTCTGGTA 2580 

GCCATAGCAG CCCCTGCTTC AGCCCCACCG CACCCCTCCA GGGGGCCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCCCCTCCA GCCCTAGCCG AGCTGGTCCT AGGCCCCGCC TAGTGCCCCA CCCCCACCCA 2820 

CAGTGCTGCA CTCCTCCTGC CCCTGCCACG CCCACCCCCT GCCCACCTCT CCAC-GCTCTG 2880 

CTCTGCAGCA CACCCGTGGG TGACCCCTCA CCCCAGAAGC AGCAGTGGCA GCTTGGGAAA 2940 

TGTGAGGAAG GGAAGGAGGG AGAGACGGGA GGGAGGAGAG AGAGGAGAAG GGAGGCAGGG 3000 

GAGGGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC CCCATCCCTG 3060 

TTATAGAAGC TTAGAGAGCC AGCCAGCAAT GGAACCTTCT GG7TCCTGCG CCAA7CGCCA 3120 

CCAGTATCAA TTGTGTGAGC TTGGGTGCGA GTGCACGCGT GCGTGAGTAC GGAGAGTATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAATGGCGCC TCTGGGCAAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGA7TTC 7GCTTGTATA 3300 

TTTCTAAAAA GAGGAAGGAG CCCAAACCAT CCTCTCCTTA CCACTCCCAT CCCTGTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGG AGTGTGAATT TATAGATCTA ACTTTCATAG 3420 

GCAAAACAAA AGCTTCGAGC TGTTGCGTGT GTGAGTCTGT TG7GTGGATG 7GCG7GTGTG 3480 

GTCCCCAGCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTCGGGG CTC-TCCCCAC 3540 
T TGCCACAAGT CTGTGGGGCA AGAGGCTGCA ATATTCCGTC CTGGGTGTCT
CTTCTGTGTA GCAGCTTTAA CCCACGTTTG TCTGTCACGT CCAGTCCCGA GACGGCTGAG 3780
TGACCCCAAG AAAGGCTTCC CCGACACCCA GACAGAGGC7 GCAGGGCTGG GGCTGGGTGA 3840 
GGGTGGCGGG CCTGCGGGGA CATTCTACTG TGCTAAAAAG CCACTGCAGA CATAGCAATA 3900
AAAACAIGTC ATTTTCC 



sequer 






I YSVSGDEKKG PLIAPGPDGA P 
JVKRFPY LCYKNGGGVF L 
INVWNICPLF KGLGYASMVI VFYCNTY 
CVEIFRHEDC ANASLANLTC DQLADRRSPV IEPWENKVLR LSGGLEVPGA LMEVTLCLL 
ACWVLVYFCV WKGVKSTGKI VYFTATFPYV VLWLLVRGV LLPGALDGII Y 
GSPQVWIDAG TQIFFSYAIG LGALTAL 
SILGFMAAEQ GVHISKVAES GPGLAFIAYP RAVTLMPVAP LWAALFFPML LLLC-LDSQFV 
GVEGFITGLL DLLPASYYFR FQREISVALC CALCFVIDLS MVTDGGMYVF QLFDYYSASG 
TTLLWQAFWE CVWAWVYGA DRFMDDIACM IGYRPCPWMK WCWSFFTPLV CMGIFIFNW 
YYEPLVYNNT YVYPWWGEAM GWAFALSSML CVPLHLLGCL LRAKGTMAER WQHLTQPIWG 
LHHLEYRAQD ADVRGLTTLT PVSESSKVW VESVM 



Seq ID NO: 68 DNA sequence
Coding sequence: 178-2469 






I I I I 

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 
CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC 
CAGTCTGGAG GGTCCACACT TGTGATTCTC 
AAAGCTAGCC CCCGTCGGCC ACTGATTCTC 
AATGCCCCAA GTGAAACATC AGAGGAGGAA 
GGAAGTGGCG 



ATCAAGATTA 
GCTAATATTC 
GGGCCCAACA 
CGGCCTCAAA 
GGACCAAAAC 
CAGAAACGGG 
TCCAACATCC 
CAAGAGATGG 
CCTTCGAGAC 
ATGGCCATGA 
ATCTATACGT 



GTGAAACATC 
AGGCCTCCAA 
TTAACCACCC 
ACAGCATCAT 
AATTCATCCT 
CCCAAACCAG 
CTGCAGCTAG 
AGACCTGTGC 
AGTGGCTTCG 



CACAGCACTG 
CATCAGCTGT 
CTATGATGCC 



GG AG CTACGG 
AATGGAGAGT 
AAAAGACGGA 
CCTAAGAGAT 

GAGTCCAACT CTTGCAAGTT 
AACACGCAAG 
ACTGCCAAGG 
GGGGGAGCCC 



ATTCATAATG 
TCCTGTTCAA 
ACAGGAGTCT 



CTTCCTAGAC C 



T GGAGACCTTG 



GGATTGAGGA 



GGCGCACGGC GGAAGATGAA 



CCACGGGTCA GCTCATACCT 



ACTGGTGTTG CAGCCCTCGG 



AAGGTGCTGC 
GAGAAACTCC 
GAAGAAATCC 
CCTCCCTTGG 
TGGGAGGATT 



TAGCTGAGGA G 



AGCGAGTCCG 



ATTGCCCCTG 



T CCTTTGCTTC 



AGCCGGTCTC 
TTCTCAGAGG 
TCTGACCCTG 
ATTAAGGAAA 



GGAGGAAACA 
GGCCCAGTAC 
CCTCCCAGCT 



3 CTCCCCGGCC 
C TCCCACCCCA 
Z GGAAATGCTT 



CCATCTTTCA 
AGACCCAAGA 
GTGATTCAAC 



AGTGGAGAGC 
AAGAGC-AATC ATCTCACTCC 
AGTCCTACAG TGGGCTTAGG 



CCAGGCTCCC 



GGCTCACGCC 
GTGCCTCTGA 
AAAGTGCTCC 
TCTCCGTCCC 
CGGAGCCACA 



TTCCCGCTGG 
CAGCTACTCC 
CTCCTCCACC 
CCCAGCCAAA 
CCCCTTGCCT 
CCCCCTTGAA 
CTTTGGCAAC 



TCCCGTTCCC 
GAGGACCTTT 
CTGTCCTCCC 
TGGATTTCAG 



GGAGCTGCTC 
AGCAGACTCC 
TAAGACACCC 
CAGAACCCCT : 
CCCAGTACAA : 



TCTTCTCCCT 



GGCTCCTCAG 
CAGATATAGA 
ATCGTTCTCT 



TTCAGAACCC 



GACAGAAGGC : 



GGCCTGGACG AGGACCCACT GGGCCCTGAC AACATCAACT GGTCCCAGTT TATTCCTGAG 
CTACAGTAGA GCCCTGCCCT TGCCCCTGTG CTCAAGCTGT CCACCATCCC GGGCACTCCA 
AGGCTCAGTG CACCCCAAGC CTCTGAGTGA GGACAGCAGG CAGGGACTGT TCTGCTCCTC 
ATAGCTCCCT GCTGCCTGAT TATGCAAAAG TAGCAGTCAC ACCCTAGCCA CTGCTGGGAC 
CTTGTGTTCC CCAAGAGTAT CTGATTCCTC TGCTGTCCCT GCCAGGAGCT GAAGGGTGGG 



AAGTCTTTTG 
AGAGTGTGGG 
CCAGGGAGAC 



CTTACCTTCC CTGATCTTTG 
TAAATGTAAG 
ACCTGGGGTT 
GGAAGACCTG CAGTGCACGG 
TGCAGGGACC CAGACAAGTG 



TATTGGGTCA GGAGTTGAAT 
TGCCCAGATG TGCGCTATTA 
TGGCATTGAC GAGAACTCAG 
GGCTTCCTTA GCTTGCCCCT 
TGGGTGTGAG CCAGCTTGAG 
AAAAAAAAAA AAAA 



GATCTGCTTG 



GATGTTTCTC 
GTGGAGGCTT 
CAGCTTTGCA 
AACACTAACT 



TAGATCATTA TCCAGAGACT 
TCTGTTCCTT GCTTTTAGTT 
GCTGAGGTAC CTGGATCTTG 
CCAGAGTCCT TTTTGCCCCT 
CTGGTTAAAA 
AGGATGGATG CAACTGAAGC 
TGATAATGTC CCCAATCATA 
GAGAAGGCCG AAAGGGCCCC 
AAGAGCCACC CTAGGCCCCA 
ACTCAATAAA AGCGAAGGTG 



2520 
2580 
2640 






MKASPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 






LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ N

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSISHNLS LKDMFVRSTS 300 

ANGKVSFKTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360 

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL A 

I APLSSAGPGK EEKLLFGEGF SPLLPVQTIK E 

i SWEDSSQSPT PRPKKSYSGL RSP7RCVSEM LVIQKRERRE 54 0 

'SR WAAELPFPAD SSDPA3QLSY SQEVGGPFKT 60 0 

T PESWRLTPPA KVGGLDFSPV QTSQGASDPL PDPLGLMOLS 660 

TTPLQSAPPL ESPQRLLSSE PLDLISVPFG NSSPSDIDVP KPGSPEPQVS GLAANRSLTE 720

GLVLDTMNDS LSKILLDISF PGLDEDPLGP DNINWSQFIP ELQ 

Seq ID NO: 70 DNA sequence 
Nucleic Acid Accession #: BC006529.1

1 11 21 31 41 51

I I I I I I 

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCC c ( lLT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG A7TCATAATG 180 

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300

G AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

A TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420

C ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

A AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATG7 TTGTCCGGGA GACGTCTGCC 1080

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140

CAGGTGTTTA AGCAGCAGAA ACGACCGAAT CCAGAGCTCC GCCGGAACAT GACCATCAAA 1200

ACCGAACTCC CCCTGGGCGC ACGGCGGAAG ATGAAGCCAC TGCTACCACG GGTCAGCTCA 1260

TACCTGGTAC CTATCCAGTT CCCGGTGAAC CAGTCACTGG TGTTGCAGCC CTCGG7GAAG 1320

GTGCCATTGC CCCTGGCGGC TTCCCTCATG AGCTCAGAGC TTGCCCGCCA TAGCAAGCGA 1380 

GTCCGCATTG CCCCCAAGGT GCTGCTAGCT GAGGAGGGGA TAGCTCC7CT TTCTTCTGCA 1440 

GGACCAGGGA AAGAGGAGAA ACTCCTGTTT GGAGAAGGGT TTTCTCCTTT GCTTCCAGTT 1500 

CAGACTATCA AGGAGGAAGA AATCCAGCCT GGGGAGGAAA TGCCACACTT AGCGAGACCC 1560 

ATCAAAGTGG AGAGCCCTCC CTTGGAAGAG TGGCCCTCCC CGGCCCCATC T 



TACAGTGGGC TTAGGTCCCC AACCCGGTGT GTCTCGGAAA T 

GAGAGGAGGG AGAGGAGCCG GTCTCGGAGG AAACAGCATC TACTGCCTCC CTGTGTGGAT 1800

GAGCCGGAGC TGCTCTTCTC AGAGGGGCCC AGTACTTCCC GCTGGGCCGC AGAGCTCCCG 1860 

TTCCCAGCAG ACTCCTCTGA CCCTGCCTCC CAGCTCAGCT ACTCCCAGGA AGTGGGAGGA 1920 

CCTTTTAAGA CACCCATTAA GGAAACGCTG CCCATCTCCT CCACCCCGAG CAAATCTGTC 1980

CTCCCCAGAA CCCCTGAATC CTGGAGGCTC ACGCCCCCAG CCAAAGTAGG GGGACTGGAT 2040

TTCAGCCCAG TACAAACCCC CCAGGGTGCC TCTGACCCCT TGCCTGACCC CCTGGGGCTG 2100 

ATGGATCTCA GCACCACTCC CTTGCAAAGT GCTCCCCCCC TTGAATCACC GCAAAGGCTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCCCTTTG GCAACTCTTC TCCCTCAGAT 2220

ATAGACGTCC CCAAGCCAGG CTCCCCGGAG CCACAGGTTT CTGGCCTTGC AGCCAATCGT 2280

TCTCTGACAG AAGGCCTGGT CCTGGACACA ATGAATGACA GCCTCAGCAA GATCCTGCTG 2340

GACATCAGCT TTCCTGGCCT GGACGAGGAC CCACTGGGCC CTGACAACAT CAACTGGTCC 2400

CAGTTTATTC CTGAGCTACA GTAGAGCCCT GCCCTTGCCC CTGTGCTCAA GCTGTCCACC 2460

ATCCCGGGCA CTCCAAGGCT CAGTGCACCC CAAGCCTCTG AGTGAGGACA GCAGGCAGGG 2520 

ACTGTTCTGC TCCTCATAGC TCCCTGCTGC CTGATTATGC AAAAGTAGCA GTCACACCCT 2580

AGCCACTGCT GGGACCTTGT GTTCCCCAAG AGTATCTGAT TCCTCTGCTG TCCCTGCCAG 2640

GAGCTGAAGG GTGGGAACAA CAAAGGCAAT GGTGAAAAGA GATTAGGAAC CCCCCAGCCT 2700

GTTTCCATTC TCTGCCCAGC AGTCTCTTAC CTTCCCTGAT CTTTGCAGGG TGGTCCGTGT 2760 

AAATAGTATA AATTCTCCAA ATTATCCTCT AATTATAAAT GTAAGCTTAT TTCCTTAGAT 2820

CATTATCCAG AGACTGCCAG AAGGTGGGTA GGATGACCTG GGGTTTCAAT TGACTTCTGT 2880

TCCTTGCTTT TAGTTTTGAT AGAAGGGAAG ACCTGCAGTG CACGGTTTCT TCCAGGCTGA 2940

GGTACCTGGA TCTTGGGTTC TTCACTGCAG GGACCCAGAC AAGTGGATCT GCTTGCCAGA 3000 

GTCCTTTTTG CCCCTCCCTG CCACCTCCCC GTGTTTCCAA GTCAGCTTTC CTGCAAGAAG 3060 

AAATCCTGGT TAAAAAAGTC TTTTGTATTG GGTCAGGAGT TGAATTTGGG GTGGGAGGAT 3120

GGATGCAACT GAAGCAGAGT GTGGGTGCCC AGATGTGCGC TATTAGATGT TTCTCTGATA 3180

ATGTCCCCAA TCATACCAGG GAGACTGGCA TTGACGAGAA CTCAGC-TGGA GGCTTGAGAA 3240 

GGCCGAAAGG GCCCCTGACC TGCCTGGCTT CCTTAGCTTG CCCCTCAGCT TTGCAAAC-AG 3300

CCACCCTAGG CCCCAGCTGA CCGCATGGGT GTGAGCCAGC TTGAGAACAC 7AAC7ACTCA 3360 
A AGGTGGAAAA AAAAAAAAAA AAAAAAA 




1 11 21 31 41 51 

I I I I I I 

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE 3NQASASKEV AESNSCKFPA 60

GIKIINHPTM PNTQWAIPN NANIHSIITA LTAKGKESGS SGPNKPILIS CGGAPTQPPG 120
LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC BQKRETC 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ N 
YMAMIQPAIN STERKRMTLK DIYTWIEDHF PYPKHIAKPG WKNSIRE 

ANGKVSFWTI HPSANRYLTL DQVFKQQKRP NPELRRNMTI KTELPLGARR K 
SYLVPIQFPV NQSLVLQPSV KVPLPLAASL MSSELARHSK RVRIAPI 

AGPGKEEKLL PGEGPSPLLP VQTIKEEEIQ PGEEMPHLAR PIKV3SPPLE EWPSPAPSFK 

EESSHSWEDS SQSPTPRPKK SYSGLRSPTR CVSEMLVIQH RERRERSRSR RKQHLLPPCV 

DEPELLFSEG PSTSRWAAEL PFPADSSDPA SQLSYSQEVG GPFKTPIKET LPISSTPSKS 

VLPRTPESWR LTPPAKVGGL DFSPVOTPQG ASDPLPDPLG LMDL3TTPLQ SAPPLESPQR 

LLSSEPLDLI SVPFGNSSPS DIDVPKPGSP EPQVSGLAAN RSLTEGLVLD TMNDSLSKIL 
LDISFPGLDE DPLGPDNINW SQFIPELQ 

Seq ID NO: 72 DNA sequence 
Nucleic Acid Accession #: U74612.1
Coding sequence: 178-2583 

1 11 21 31 41 51 

I I I I I I 



CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG
CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 
AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA
AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 



AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 
AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCC 
AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAG7GCCA ACCGCTACTT G 
CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCJ 

CAGAAACGAC CGAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACCGA A

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 
CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTCGG TGAAGGTGCC A 
GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAG 1 ] 
AAGGTTTTTG GGGAACAGGT GGTGTTTGGT TACATGAGTA AGTTCT1 

CGAGATTTTG GTACACCCAT CACCAGCTTG TTTAATTTTA TCTTTCTTTG TTTATCAGTG 1560

CTGCTAGCTG AGGAGGGGAT AGCTCCTCTT TCTTCTGCAG GACCAGGGAA AGAGGAGAAA 1620 

CTCCTGTTTG GAGAAGGGTT TTCTCCTTTG CTTCCAGTTC AGACTATCAA GGAGGAAGAA 1680 

ATCCAGCCTG GGGAGGAAAT GCCACACTTA GCGAGACCCA TCAAAGTGGA GAGCCCTCCC 1740 

TTGGAAGAGT GGCCCTCCCC GGCCCCATCT TTCAAAGAGG AATCATCTCA CTCCTGGGAG 1800



ACCCGGTGTG TCTCGGAAAT GCTTGTGATT CAACACAGGG A 
TCTCGGAGGA AACAGCATCT ACTGCCTCCC TGTGTGGATG AGCCGGAGCT G 

GAGGGGCCCA GTACTTCCCG CTGGGCCGCA GAGCTCCCGT TCCCAGCAGA CTCCTCTGAC 2040

CCTGCCICCC AGCTCAGCTA CTCCCAGGAA GTGGGAGGAC CTTTTAAGAC ACCCATTAAG 2100 

GAAACGCTGC CCATCTCCTC CACCCCGAGC AAATCTGTCC TCGCCAGAAC CCCTGAATCC 2160

TGGAGGCTCA CGCCCCCAGC CAAAGTAGGG GGACTGGATT TCAGCCCAGT ACAAACCTCC 2220 

CAGGGTGCCT CTGACCCCIT GCCTGACCCC CTGGGGCTGA TGGATCTCAG CACCACTCCC 2280 

TTGCAAAGTG CTCCCCCCCT TGAATCACCG CAAAGGCTCC TCAGTTCAGA ACCCTTAGAC 2340 

CTCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAGATA TAGACGTCCC CAAGCCAGGC 2400 

TCCCCGGAGC CACAGGTTTC TGGCCTTGCA GCCAATCGTT CTCTGACAGA AGGCCTGGTC 2460

CTGGACACAA TGAATGACAG CCTCAGCAAG ATCCTGCTCG ACATCAGC7T TCCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAG 2580 

TAGAGCCCTG CCCTTGCCCC TGTGCTCAAG CTGTCCACCA TCCCGGGCAC TCCAAGGCTC 2640 

AGTGCACCCC AAGCCTCTGA GTGAGGACAG CAGGCAGGGA CTGTTCTGCT CCTCATAGCT 2700 

CCCTGCTGCC TGATTATGCA AAAGTAGCAG TCACACCCTA GCCACTGCTG GGACCT7GTG 2760

TTCCCCAAGA GTATCTGATT CCTCTGCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820

AAAGGCAATG GTGAAAAGAG ATTAGGAACC CCCCAGCCTG TTTCCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTGATC TTTGCAGGGT GGTCCGTGTA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCCAGA 3000 

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTT7T AGTTTTGATA 3060

GAAGGGAAGA CCTGCAGTGC ACGGTTTCTT CCAGGCTGAG GTACCTGGAT CTTGGGTTCT 3120

TCACTGCAGG GACCCAGACA AGTGGATCTG CTTGCCAGAG TCCTTTTTGC CCCTCCCTGC 3180 

CACCTCCCCG TGTTTCCAAG TCAGCTTTCC TGCAAGAAGA AATCCTGGTT AAAAAAGTCT 3240 

TTTGTATTGG GTCAGGAGTT GAATTTGGGG TGGGAGGATG GATGCAACTG AAGCAGAGTG 3300 

TGGGTGCCCA GATGTGCGCT ATTAGATGTT TCTCTGATAA TGTCCCCAAT CATACCAGGG 3360

AGACTGGCAT TGACGAGAAC TCAGGTGGAG GCTTGAGAAG GCCGAAAGGG CCCCTGACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCTCAGCTT TGCAAAGAGC CACCCTAGGC CCCAGCTGAC 3480 

CGCATGGGTG TGAGCCAGCT TGAGAACACT AACTACTCAA TAAAAGCGAA GGTQGACAAA 3540 
AAAAAAAAAA AAAAA
I I I I I I

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60

GIKIINHPTM PNTQWAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADC- EAAC-CTINNS 180 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300

ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSS3L ARHSKRVRIA 420

PKVFGEQWF GYMSKFFSGD LRDFGTPITS LFNFIFLCLS VLLA3EGIAP LSSAGPGKEE 480 

KLLFGEGFSP LLPVQTIKEE EIQPGEEMPH LARPIKVESP PLEEWPSPAP SFKEES3HSW 540 

EDSSQSPTPR PKKSYSGLRS PTRCVSEMLV IQHRERRERS RSRRKQHLLP PCVDEP3LLF 600 

SEGPSTSRWA AELPFPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPE 660

SWRLTPPAKV GGLDFSPVQT SQGASDPLPD PLGLMDLSTT PLQSAPPLES PQRLLSSEPL 720

DLISVPEGNS SPSDIDVPKP GSPEPQVSGL AANRSLTEGL VLDTMNDSLS K 

LDEDPLGPDN INWSQFIPEL Q 
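
The "Coding sequence: start-end" annotations in these records (for example 178-2583 for Seq ID NO: 72) read most naturally as 1-based, inclusive positions on the DNA rows, which are laid out in blocks of ten bases with a running count at the end of each 60-base row. The following is only an illustrative sketch of how such a range could be applied; the function names and the toy row are hypothetical and are not part of the listing, and scanner substitutions in a row (a digit standing in for a base, for instance) would shift positions and must be corrected first.

```python
import re


def clean_listing_rows(rows):
    """Join listing rows into one uppercase base string.

    Anything that is not A, C, G, T or N (block spaces, trailing
    position counters, stray marks) is discarded.
    """
    return "".join(re.sub(r"[^ACGTN]", "", row.upper()) for row in rows)


def extract_cds(sequence, start, end):
    """Return the coding region for 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def residue_count(start, end):
    """Number of encoded amino acids, excluding the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return length // 3 - 1


# Toy row in the layout of the listing (not taken from the patent).
rows = ["AAATGGCCTA GTT 13"]
sequence = clean_listing_rows(rows)
cds = extract_cds(sequence, 3, 11)  # start codon through stop codon
```

Under this reading, a range of 178-2583 spans 2406 bases, that is, 802 codons including the stop, so `residue_count(178, 2583)` gives 801 residues.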

Seq ID NO: 74 DNA sequence 
Nucleic Acid Accession #: Ec
Coding sequence: 111-416 

1 11 21 31 41 51

I I I I I I

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC

TCATCCTTCT ACTOGTGACG CTTCCCAGCT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 

CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCAGACGTG 

ATGACAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAACTTCC 

TTAGTGCCTG TGACAAAAAG GGCACAAATT ACCTCGCCGA TGTCTTTGAG AAAAAGGACA 
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA

CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC CGGGGGCAGC CAGTGACCCA 

GCCCCACCAA TGGGCCTCCA GAGACCCCAG GAACAATAAA ATGTCTTCTC CCACCAGA 

Seq ID NO: 75 Protein sequence: 



Seq ID NO: 76 DNA sequence 
Nucleic Acid A 
Coding ss 



I I I I I I 

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 
TCATCCTTCT ACTCGTGACA CTTCCCAGTT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA

CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCGGACGTG 
ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAATTTCC 
TCAGTGCCTG TGACAAAAAG GGCATACATT ACCTCGCCAC TGTCTTTGAG AAAAAGGACA
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC C ~ ~~ 
CAGACTACCA CAAGCAGAGC CATGGAGCGG CGCCCTGTTC T
GCCCCACCAA GGGGCCTCCA GAGACCCCAG GAACAATAAG T 



Seq ID NO: 77 Protein sequence:



Seq ID NO: 78 DNA sequence 
Nucleic Acid Accession #: Z73678.1
Coding sequence: 253-2433 

1 11 21 31 41 51 

I I I I I I 

GGGGTGGTGC AGGGCAGGGG TGGTATATCC TGTCTGACGG AGGGCC 3C TCGC AGTGC 
CAGAGAGGGA CGAACCAGGG TGGAAGCGCC AGGAGCAGCT GCAGGGAGCC CTCACGCGGA 
CCTCGCACTC TATGGCCGTA GGGAGCCGCT GAGAGCGAGA AGAGCACGCT CCTGCCCGCC 
CGCTGCACCG CACCTCGCCT CGCCTCTCTG CTCTCCTAGG CCCC3GCCGC GCGCCACCCG 
CCTCCCGCCA CCATGAACCA CTCGCCGCTC AAGACCGCCT TGGCGTACGA ATGCTTCCAG 
GACCAGGACA ACTCCACGTT GGCTTTGCCG TCGGACCAAA AGATGAAAAC AGGCACC-TCT 
GGCAGGCAGC GCGTGCAGGA GCAGGTGATG ATGACCGTCA AGCGGCAGAA GTCCAAC-TCT 
TCCCAGTCGT CCACCCTGAG CCACTCCAAT CGAGGTTCCA TGTATGATGG CTTGGCIGAC 
AATTACAACT ATGGGACCAC CAGCAGGAGC AGCTACTACT CCAAGTTCCA GGCAGGGAAT
GGCTCATGGG GATATCCGAT CTACAATGGA ACCCTCAAGC GGGAGCCTC-A CAACAGC-CGC 
TTCAGCTCCT ACAGCCAGAT GGAGAACTGG AGCCGGCACT ACCCCCGGGG CAGCTGTAAC 
ACCACCGGCG CAGGCAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGCGAG CCC-CAGTGAG 
CCCGACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG GCACC-CTGGG CAC-CAAGGGC 
CAGAAGACCA CCCAGAACCG CTACAGCTTT TACAGCACCT GCAGTGGTCA GAAGGCCATA 
AAGAAGTGCC CTGTGCGCCC GCCCTCTTGT GCCTCCAAGC AGGACCCTC-T GTATATCCCG
CCCATCICCT GCAACAAGGA CCTGTCCTTT GGCCACTCTA GGGCCAGCTC CAAGATCTGC 960

A TCGAGTGCAG TGGGCTGACC ATCCCCAAGG CTGTC-CAGTA CCTGAGCTCC 1020 

•CCAGGC CATTGGGGCC TATTACATCC AGCATACCTG CTTCCAGGAT 1080
GAATCTGCCA AGCAACAGGT CTATCAGCTG GGAGGCATCT GCAAC-CTGGT GGACCTCCTC 1140

CGCAGCCCCA ACCAGAACGT CCAGCAGGCC GCGGCAGGGG CCCTC-CGCAA CCTGGTC-TTC 1200

AGGAGCACCA CCAACAAGCT GGAGACCCGG AGGCAGAATG GGATCCGCGA GGCAGTCAGC 1260

CTCCTGAGGA GAACCGGGAA CGCCGAGATC CAGAAGCAGC TGACTGGGCT GCTCTGGAAC 1320 

CTGTCTTCCA CTGACGAGCT GAAGGAGGAA CTCATTGCCG ACGCCCTGCC TGTTCTGGCC 1380

GACCGCGTCA TCATTCCCTT CTCTGGCTGG TGCGATGGCA ATAGCAACAT GTCCCG3GAA 1440 

GTGGTGGACC CTGAGGTCTT CTTCAATGCC ACAGGCTGCT TGAGC-AACCT GAC-CTCGGCC 1500

GATGCAGGCC GCCAGACCAT GCGTAACTAC TCAGGGCTCA T7GATTCCCT CATGGCCTAT 1560

GTCCAGAACT GTGTAGCGGC CAGCCGCTGT GACGACAAGT CTGTGGAAAA CTGCATGTGT 1620

GTTCTGCACA ACCTCTCCTA CCGCCTGGAC GCCGAGGTGC CCACCCGCIA CCGCCAGCTG 1680

GAGTATAACG CCCGCAACGC CTACACCGAG AAGTCCTCCA CTGGCTGCTT CAGCAACAAG 1740 

AGCGACAAGA TGA1GAACAA CAACTATGAC TGCCCCCTGC CTGAGGAAGA GACCAACCCC 1800 

AAGGGCAGCG GCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCTCATGGGC 1860

AAGAGCAAGA AAGATGCTAC CCTGGAGGCC TGTGCTGGTG CCCTGCAGAA CCTGACAGCC 1920 

AGCAAGGGGC TGATGTCCAG TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGGGCCTG 1980 

CCACAAATTG CCCGCCTCCT GCAATCTGGC AACTCTGATG TGGTGCGGTC CGC-AGCCTCC 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATGGGGAA CCAGGTGTTC 2100 

CCGGAGGTGA CCAGGCTCCT CACCAGCCAC ACTGGCAATA CCAGCAACTC CGAAGACATC 2160 

TTGTCCTCGG CCTGCTACAC TGTGAGGAAC CTGATGGCCT CGCAGCCACA ACTGGCCAAG 2220 

CAGTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCGAAG CAGTGCCTCA 2280 

CCCAAGGCCG CAGAAGCTGC CCGGCTTCTC CTGTCTGACA TGTGGTCCAG CAAGGAACTG 2340 

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGG3AACCTT AGCTGGGGCC 2400 

AACAGCCTCA GGAACTTCAC CTCCCGATTC TAAGAAGAGA CTGTCCAAGC AAGTTAGGCT 2460

TGCAGGAAGA TATGACCCAG CTGAGAAGCC CTCAGGCC7C GCTGC-ATGGG GTTTTCTGTC 2520 

CATCC1GTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAACCTA AAAACTGTGG 2580

ATAGTGGAAA GATTTTTAGA TTTTTTTTTT CCTTGGGGAA ACTGGCAGGC AATGGGGGTT 2640

AGGGAGGTTG GGGCGGGGGG GGCTTTCTTG AGTTAAAGGG GCTTATATGT GATGTCAATA 2700

TTTCTTCCTC TGAGAAATGG TATATATATG TGTCTAATGT AAGTGTGTGC ATGCATGTGC 2760

GCGTGCATGT GTGTGTGTGT GAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820 

TAAGCTATTT TGTTGCAGCT CATAAGGTGG TGAAAAGGAC TCTCCTGTGT TTCTTACTCA 2880

TAGGCAAGGA CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTGAAA TGTGTGGTGC 2940

CAGGGCAAGG GGGCCATCAC TGCAGTCAGG CCCTCAGAGG AGTCCTGCAG GCTTCCTACC 3000

AGTGGTCTCC AAGGGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060 

GCTTTCCACG AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120

ACCCCTCCAG CAGCGCCACA AGGACTGAGG TTGGGTAGGT GTGAGGTTCC AGAGGACAGC 3180 

AGGACACTCT CGCATACTTT GCCAAATGAG GCCTGCTCAG AGGAG7AGGA GCTGAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCC CATCAGAGCA GGCTCAGCCT GCAAAGGCCC 3300

TGCATTCAGA GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAGGTAAAA AATGATTTTT 3360 

TTAAGAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGGCTGGTCT 3420 
TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCA TATACCTATT C 
GTGGGATGGA GTTGGGGTAT AGAAATTAAC CAGGAAGATG TTTCCACCAA 
AGTCAATTGA GGGAGTGTTT GGGTCCCAGG AGACTTGGAC GGGGGGAGT7 

CCATGCCCCA CTTCCCCTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA CCCACAGGGG 3720

?i CAGGGCTATT AGGGGTATCA GCCACGTCGA GCCCCCAGAC TCTGTGCACT 3 780 

1GCAGGA GGGCTCCCGA GGGCCTTATG AGAAAACCTG TGTGGACATC 3 84 0 

CCTTGGTGTA CACTAAGACA GAGCAGAGCC CAGCGCTCCC AAGCCTTCCT CCTTCCAGCT 3900

TCTACCTCCA TGCTAGCATT GCTGGTGTTA GAGAGGAATT AACTTCCTGG TCTGTGCCCT 3960 

TCTCTAGAAG AATATAAGAT GCTCCTCCTC CTCACCCCTT CTCAGCCTCC TCCCAAGTCT 4020

TCCTCTTCTG CACCACCCCC GAGTCCAAAC CCACCTCTTG CCCCAGCATT CAGGCTGGAA 4080

AACACTGATG TGGACTCAGT ATGACAACTG AGATGGGGGA AGCCAGACAT GTGAGGACGC 4140 

TGTCCTCCGA GAGGTGTCCC CGGCTGTTAG CCAGCTGTGC TGTGGTGCTG TGGGTCTGTC 4200 

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CTGGCTCACC TCTCCCTCTC 4260 

AGGGACCCAC GTGGGAGCCT GGATCCCTGG ACTGTCCTGG GCATAGGTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCAGAACCC AGAGGAATTC TTCTCCTAAA AAATACGTAT GGCATACCAA 4380

TCTGTGCGGG GCAGTGTCCT AAGCACTTAG ACTACATCAG GGAAGAACAC AGACCACATC 4440 

CCCGTCCTCA TGCGGCTTAT GTTTTCTGGA GGAAAGTGGA GACACAAGTC CITGGCTTTA 4500 

GGGCTCCCCC GGCTGGGGGC TGTGCAGTCC GGTCAGGGCG GGAGGGGAAA TGCACCGCTG 4560 

CATGTGAACC TTACCAGCCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620

GCATCCCCTC GCCTCATGTT CCTCCCACCT TCAAAGAATG AAGAGCCCCA TGGGCCCAGC 4680

CCCTGCCCTG GGAACCAGGC AGCCTTCCAG ACCTCAGGGG CTGAGGCAGA CTATTAGGGC 4740 

AGGGCTGACT TTGGTGACAC TGCCCATTCC CTCTCAGGCC AGCTCAGGTC ACCCGGGCCT 4800 

CTGACCCAGG CCTGTCACTT TGAGAGGGGC AAAACTGAGA GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAAGGA GCTTGCCAGG CTTCATGTAG CCGACACACG TCTCAGGATT TTAAGTCCAC 4920 

ATTGGCCTCA CACTAGCCTA GGCCAATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980

TGAGGAAGGA CACAGACTCT GCCCTGGGAT CTCCTGTGCT AGCGGCCAAT GACAAATCCA 5040

GTCATTGGCC ACCAGCCACC TCTGCAGTGG GGACCACACT AGCAGCCCTG ACTCCACACT 5100 

CCTCCTGGGG ACCCAAGAGG CAGTGTTGCT GTCTGCGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCAAGA CTGCGGCTGG GGTGGGCAGG GAAGGGAAGC CGGGGGCTGC 5220 

TGTGAGGGAT CTTGGAGCTT CCCTGTAGCC CACCTTCCCC TTGCTTCATG TTTGTAGAGG 5280 

AACCTTGTGC CGGCCAGGCC CAGTTTCCTT GTGTGATACA CTAATGTATT TGCTTTTTTT 5340 



Seq ID NO: 79 Protein sequence:



I I I I I I 

MHHSPLKTAL AYECFQDQDN STLALPSDQK MKTGTSGRQR VQEQVMMTVK RQKSKSSQSS 

TLSHSNRGSM YDGLADNYNY GTTSRSSYYS KFQAGNGSWG YPIYNGTLXR EPDNRRFSSY 

SQMBNWSRHY PRGSCNTTGA GSDICFMQKI KASRSEPDLY CDPRGTLRKG ILGSKGQKTT 

QNRYSFYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDLSFGHSR ASSKICSEDI 

ECSGLTIPKA VQYLSSQDEK YQAIGAYYIQ HTCFQDESAK QQVYQLGGIC KLVDLLRSPN 

QNVQQAAAGA LRNLVFRSTT NKLETRRQNG IREAVSLLRR TGKAEIQKQL TGLLWNLSST
DELKEBLIAD ALPVLADRVI IPPSGWCDGN SNMSREWDP EVFFNATGCL RNLSSADAGR 420

QTMRNYSGLI DSLMAYVQNC VAASRCDDKS VENCMCVLHN LSYRLDAEVP TRYRQLEYNA 480 

RNAYTEKSST GCFSNKSDKM MHNNYDCPLP EEETNPKG3G WLYKSDAIR7 YUILMGKSKK 540 

DATLEACAGA LQNLTASKGL MSSGMSQLIG LKEKGLPQIA RLLQSGNSDV VRSGASLLSN 600

MSRHPLLHRV MGNQVFPEVT RLLTSHTGNT SNSEDILSSA CYTVRNLMAS QPQLAKQYFS 660 

SSMLNKIINL CRSSASPKAA EAARLLLSDM WSSKELQGVL RQQGFDRKML GTLAGANSLR 720 
NFTSRF 



I I I I I I 

TAGTCGCGGG TCCCCGAGTG AGCACGCCAG GGAGCAGGAG ACCAAACGAC GGGGGTCGGA 60 

GTCAGAGTCG CAGTGGGAGT CCCCGGACCG GAGCACGAGC CTC-AGCGGGA GAGCGCCGCT 120 

CGCACGCCCG TCGCCACCCG CGTACCCGGC GCAGCCAGAG CCACCAGCGC AGCGCTGCCA 180 

TGGAGCCCAG CAGCAAGAAG CTGACGGGTC GCCTCATU T CTGTGGGA CGAGCAGTGC 240 

TTGGCTCCCT GCAGTTTGGC TACAACACTG GAGTCATCAA TGCCCCCCAG AAGGTGATCG 300 

AGGAGTTCTA CAACCAGACA TGGGTCCACC GCTATGGGGA GAGCATCCTG CCCACCACGC 360 

TCACCACGCT CTGGTCCCTC TCAGTGGCCA TCTTTTCTGT TGGGGGCATG ATTGGCTCCT 420 

TCTCTGTGGG CCTTTTCGTT AACCGCTTTG GCCGGCGGAA TTCAATGCTG ATGATGAACC 480 

TGCTGGCCTT CGTGTCCGCC GTGCTCATGG GCTTCTCGAA ACTGGGCAAG TCCTTTGAGA 540 

TGCTGATCCT GGGCCGCTTC ATCATCGGTG TGTACTGCGG CCTGACCACA GGCTTCGTGC 600

CCATGTATGT GGGTGAAGTG TCACCCACAG CCTTTCGTGG GGCCCT3GGC ACCCTGCACC 660 

AGCTGGGCAT CGTCGTCGGC ATCCTCATCG CCCAGGTGTT CGGCCTGGAC TCCATCATGG 720

GCAACAAGGA CCTGTGGCCC CTGCTGCTGA GCATCATCTT CATCCCGGCC CTGCTGCAGT 780 

GCATCGTGCT GCCCTTCTGC CCCGAGAGTC CCCGCTTCCT GCTCATCAAC CGCAACGAGG 840 

AGAACCGGGC CAAGAGTGTG CTAAAGAAGC TGCGCGGGAC AGCTGACGTG ACCCATGACC 900 

TGCAGGAGAT GAAGGAAGAG AGTCGGCAGA TGATGCGGGA GAAGAAGGTC ACCATCCTGG 960 

AGCTGTTCCG CTCCCCCGCC TACCGCCAGC CCATCCTCAT CGCTGTGGTG CTGCAGCTGT 1020 

CCCAGCAGCT GTCTGGCATC AACGCTGTCT TCTATTACTC CACGAGCATC TTCGAGAAGG 1080 

CGGGGGTGCA GCAGCCTGTG TATGCCACCA TTGGCTCCGG TATCGTCAAC ACGGCCTTCA 1140 

CTGTCGTGTC GCTGTTTGTG GTGGAGCGAG CAGGCCGGCG GACCCTGCAC CTCATAGGCC 1200 

TCGCTGGCAT GGCGGGTTGT GCCATACTCA TGACCATCGC GCTAGCACTG CTGGAGCAGC 1260 

TACCCTGGAT GTCCTATCTG AGCATCGTGG CCATCTTTGG CTTTGTGGCC TTCTTTGAAG 1320

TGGGTCCTGG CCCCATCCCA TGGTTCATCG TGGCTGAACT CTTCAGCCAG GGTCCACGTC 1380 

CAGCTGCCAT TGCCGTTGCA GGCTTCTCCA ACTGGACCTC AAATTTCATT GTGGGCATGT 1440

GCTTCCAGTA TGTGGAGCAA CTGTGTGGTC CCTACGTCTT CATCATCTTC ACTGTGCTCC 1500 

TGGTTCTGTT CTTCATCTTC ACCTACTTCA AAGTTCCTGA GACTAAAGGC CGGACCTTCG 1560 

ATGAGATCGC TTCCGGCTTC CGGCAGGGGG GAGCCAGCCA AAGTGATAAG ACACCCGAGG 1620 

AGCTGTTCCA TCCCCTGGGG GCTGATTCCC AAGTGTGAGT CGCCCCAGAT CACCAGCCCG 1680

GCCTGCTCCC AGCAGCCCTA AGGATCTCTC AGGAGCACAG GCAGCTGGAT GAGACTTCCA 1740

AACCTGACAG ATGTCAGCCG AGCCGGGCCT GGGGCTCCTT TCTCCAGCCA GCAATGATGT 1800 

CCAGAAGAAT ATTCAGGACT TAACGGCTCC AGGATTTTAA CAAAAGCAAG ACTGTTGCTC 1860

AAATCTATTC AGACAAGCAA CAGGTTTTAT AATTTTTTTA TTACTGATTT TGTTATTTTT 1920 

ATATCAGCCT GAGTCTCCTG TGCCCACATC CCAGGCTTCA CCCTGAATGG TTCCATGCCT 1980

GAGGGTGGAG ACTAAGCCCT GTCGAGACAC TTGCCTTCTT CACCCAGCTA ATCTGTAGGG 2040

CTGGACCTAT GTCCTAAGGA CACACTAATC GAACTATGAA CTACAAAGCT TCTATCCCAG 2100 

GAGGTGGCTA 1GGCCACCCG TTCTGCTGGC CTGGATCTCC CCACTCTAGG GGTCAGGCTC 2160 

CATTAGGATT TGCCCCTTCC CATCTCTTCC TACCCAACCA CTCAAATTAA TCTTTCTTTA 2220

CCTGAGACCA GTTGGGAGCA CTGGAGTGCA GGGAGGAGAG GGGAAGGGCC AGTCTGGGCT 2280

GCCGGGTTCT AGTCTCCTTT GCACTGAGGG CCACACTATT ACCATGAGAA GAGGGCCTGT 2340 

GGGAGCCTGC AAACTCACTG CTCAAGAAGA CATGGAGACT CCTGCCCTGT TGTGTATAGA 2400 

TGCAAGATAT TTATATATAT TTTTGGTTGT CAATATTAAA TACAGACACT AAGTTATAGT 2460 

ATATCTGGAC AAGCCAACTT GTAAATACAC CACCTCACTC CTGTTACTTA CCTAAACAGA 2520

TATAAATGGC TGGTTTTTAG AAACATGGTT TTGAAATGCT TGTGGATTGA GGGTAGGAGG 2580 

TTTGGATGGG AGTGAGACAG AAGTAAGTGG GGTTGCAACC ACTGCAACGG CTTAGACTTC 2640 

GACTCAGGAT CCAGTCCCTT ACACGTACCT CTCATCAGTG TCCTCTTGCT CAAAAATCTG 2700

TTTGATCCCT GTTACCCAGA GAATATATAC ATTCTTTATC TTGACATTCA AGGCATTTCT 2760 

ATCACATATT TGATAGTTGG TGTTCAAAAA AACACTAGTT TTGTGCCAGC CGTGATGCTC 2820
AGGCTTGAAA TCGCATTATT TTGAATGTGA AGGGAA 

Seq ID NO: 81 Protein sequence: 
Protein Accession #: NP_006507.1

1 11 21 31 41 51 

I I I I I I 

MEPSSKKLTG RLMLAVGGAV LGSLQFGYNT GVINAPQKVI EEFYNQIKVH RYGESILPTT 60 

LTTLWSLSVA IESVGGMIGS FSVGLFVNRF GRRNSHLMMN LLAFVSAVLM GFSKLGKSFE 120 

MLILGRFIIG VYCGLTTGFV PMYVGEVSPT AFRGALGTLH QLGIWGILI AQVFGLDSIM 180 

GNKDLWPLLL SIIFIPALLQ CIVLPFCPES PRFLLINRNE ENRAK3VLKK LP.GTADVTHD 240 

LQEMKEESRQ MMREKKVTIL ELFRSPAYRQ PILIAWLQL SQQLSGINAV FYYSTSIFEK 300 

AGVQQPVYAT IGSGIVNTAF TWSLFWER AGRRTLHLIG LAGMAGCAIL MTIALALLEQ 360 

LPWMSYLSIV AIFGFVAFFE VGPGPIPWFI VAELFSQGtP PAAIAVAGFS NWTSKFIVOH 420

CFQYVEQLCG PYVFIIFTVL LVLFFIFTYF KVPETKGRIF DEIASGFRQG GASQSDKTPE 480 



ELFHPLGADS QV 



sequer:
A GCGGCCGTGA
T GCAGCGATGG 
GCCCATGCCC TTCTTTTACC 



GAACGGACGA 

AAATATT1CC 
AGAGACCCAA 



TGCAAATGGA 
ATGGTTGCGA 
AAGCGGTTTC 
TACTGCAATT 



CTGAGCC1TC 
AGGTGCAGTT 



CTCCATTGCA 
CGGAGCATGG 
TTACCICTTG 



TCCTGGAAGA 
TAGAGGGGCC 
GCTGTC-GTGG 



A GACCGITGTC 



CCTCTGAGGG 



C1TCCGACCT 



AGGGCTGCCC 
CTACCAGATT CCAGGAGGCA 
ACCAGCTGGC ACAGGTGCAC 
ACTTAGGCCA AGTAGAGAGC 
CATCCATGGG GAGCTGAGAA 
TTCAAAAGTT CACGAAAAAA 



GATGGGGAGG 
TTCAGGTGAC 
TGAAGGACTC 
GGTGGAGGCG 
GAAGATAACT 
AGATTCATAA 
ATCAGGGTAA 
ATCAGACTCA 



AGTGATTTTG 
GAGGCCTAAG 



CTGTGGATGG 
AATTGTGTTG 
ATTCCCACAC 
ATGGCGTTCA 
AAGTTCCACC 
AAAAAAAAAA 



CCTCAAGGGT T 
GTGACAAC-TT TTTCTCTTTG 
TTTTCCTTGA C 



TGGGGGAGTC 1 

GGCACACGTT 
TCAACCTTTC 
AAGAAACITA GACTTCACCC 
GTGTGTGITC AACATCTGAA 
TTTCTCTC-TT AAGATGCAGC 
AAAAACAAAT ACAAGGGGAC 






CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS 






I 



21 



I 



I 



AAGCAAGGCA 
ATTCTTACAG 
CTCCTCACCT 
GAGCACAAAC 
CCTTCCCCTT 
GTCACGCCAG 
GAACACATAG 
GGAGCTCTAA 
GATGAGCCCA 
CTCTTGCAAC 
AGTCCCCTGA 



TCTTGAAGCC 



1CTT AAAAAAAAGC CATGACGGCT C 
TTGTATTATT TCTAATTTAT TTTGGATGTC A 
AGTCTCCTTC TTTCTAACCC GGCTCTCCCG A 
CGCCGCCGCC GCCGCCGCCG O 
AACCCCAGCA CTTAAGCAAA CGGGAATTCT C 
ATGATGAACC AGACCACGGC CCGTTGGGAG CTCCAGAAGG Q 
GTGGGCAGTG CCAGATGAAC TTCCCATTGG G 
GGAAACAATG CAATGGCAGC CTCTGCTTAG AAAAAGCTGT G 
CACCAATCGA GATGAAAAAA GCATCCAATC CCGTGGAGGT T 
AAGATGACGA TTGTTTATCA 1 
CAGATAAACT TCTGCACTGG AGGGGCCTCT CCTCCCCTGG TTCTGCACAT 
TCCCCACGCC TGGGATGAGT GCAGAATATG CCCCGCAGGG TA7TTGTAAA 



CACTCATGGA 



CTGTTTAGTC 
GAAGAAATGG 
CCAATGGCTA 



ATGGGATTCA TATTGCAGAC 
CGAGAGAGGC TTCCGGCCTG 
CACCACCGAG ACATCACTTG 



TCAGGACTAG GTGCAGAATG 
AATAACCCCT TTAACCTGCT 
GCAGAAGGGC GCTTTCCACC 
GACCCCCACC GCATAGAGCG 



TCCTTCCCAG 



AACACGTCTA O 



CCATTCCAGC 
TCCGCCCCTC 
ACGTTCAAAT 



r GTCCCCAGGC O 



CAGGTAGCAA GCCGCCCTTC 
CTCCCTCCCA GCCCCCGGTC 
TTCAGAGCAA CCTGGTGGTG 
ACCTGTGCGA CCACGCGTGC 
TGCACAAATC 



CTGGCGACGC 
AAGTCCAAGT 
CACCGGCGCA 



GACTTAGAGA 
CTATGCAAAG 
CCCCCCTCCC 



GTTACTGCAA 



CTGCGGCAAG 
CGAGAAGCCC 
GCGCCACATG 
TCTCTCCACC 
CGCGCTCAAG 




GACGTCATGC 
GTCCTGGGCG 
TGCGACGAAG 



AGCCCCAGCT 
CCCCCGGCCA 
GCCTCCAGGC 
GCCTCCTCGT 
GAGCTGGACG 



GCTCGCGGGG CGCGGTCGTG 
AGGGCATGGT GCTCAGCTCC 
AGAAGCATAA GCGCGGCCAC 
ACTCGGTGGC CGGCGAGTCG 
CCCCGGGCGA GTCGGCCTCG 
CGCTGAGCCC CTTCTCTAAG 
CGATGCCCAA CACGGAGAAC 



ATGCAGCACT 
CTGGCCGAGG 
GACCGCATAG 



ACGAGAGCCG 
TCAGCGAGGC 
CCGAGGGCCA 



GGCGCGCCAC 
CGCCCTGCCC 
CTTCCACCAG 
CAGGGACACT 
TG7TAATGGC 




TGTAAGATGC 



CGGAGCACTC CTCGGAGAAC GGGAGCTTGC G 

GAGGGATCTC GGGGCGCAGC GGCACGGGAA GTGGAGGGAG CACGCCCCAT 
CGGGCACGGG CAGGCCCAGC TCAAAAGAGG GCAGACGCAG O 
GGAAAGTCTT CAAGAACTGT AGCAATCT 
GGCCTTATAA ATGCGAGCTG TGCAACTA 
ACATGAAAAC GCATGGCCAG GTGGGGAAGG ACGTTTACAA ATGTGAAATT 
CTTTTAGCGT GTACAGTACC CTGGAGAAAC ACATGAAAAA ATGGCACAGT 
TATAAAAACT GAATAGAGGT ATAT7AATAC CCCTCCCTCA 
TTTCACCACT CCCTTTCCCC ATCGCCCTCC AGCCCCACTC 
TTTTTTCTAG TCCCATGTGA TTTAAACAAA CAAACAAACA AACAGAAGTA 
GAATATGAGA GTGCTTGTCA CCAGCACACG TGTTTTTTTT CTTTITCTTT 
TTTTTCCTTT TTTTTTTTTT TCCTTTATGT TCTCACCGTT TGAATGCATG
ATCIGTATGG GGCAATACTA TTGCATTTTA CGCAAACTTT GAGCCTTTCT CTTGTGCAAT 3060

rGTATGT TTTTTTTTAA ACTTAGACAG CATG7ATGGT ATGTTATGGC 312 0 

T TGTCCCTAAT TCGTTGCTGA GCAAACATGT TGCTGTTTCC AGTTCOGTTC 318 0 



TATATTGTAT TTCTCACAAC A 
ACCCATTATG TCCTAGTTAA TCATCATTTT TCCTTTAGTT TAATTTTATA 3660 



TGGGCTGTTT TGCCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGTGTAATG 3780 

TGCAAAAGCC CTGGAACGCA ATTAAATACA CTAGTAAGGA GTTCATTTTA TGAAGA7ATT 3840 

TGCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCTACTT 3900 

GGTTGTCAAG TGGACAATCA AATGATAAAC TTTAAGACCT TGTA7ACCAT ATTGAAAGGA 3960 

AGAGGCTGAC AATAAGGTTT GACAGAGGGG AACAGAAGAA AATAATATGA TTTATTAGCA 4020 

CAACGTGGTA CTATTTGCCA TTTAAAACTA GAACAGGTAT ATAAGCTAAT ATTGATACAA 4080 

TGATGATTAA CTATGAATTC TTAAGACTTG CATTTAAATG TGACATTC-T AAAAAAAGAA 4140 

GAGAAAGAAT TTTAAGAGTA GCAGTATATA TGTCTGTGCT CCCTAAAAGT TGTACTTCAT 4200

TTCTTTTCCA TACACTGTGT GCTATTTGTG TTAACATGGA AGAGGATTCA TTGTTTTTAT 4260 

TTTTATTTTT TTAATTTTTT CTTTTTTATT AAGCTAGCAT CTGCCCCAGT TGGTGTTCAA 4320

ATAGCACTTG ACTCTGCCTG TGATATCTGT ATCTTTTCTC TAATCAGAGA TACAGAGGTT 4380 

GAGTATAAAA TAAACCTGCT CAGATAGGAC AATTAAGTGC ACTGTACAAT TTTCCCAGTT 4440

TACAGGTCTA TACTTAAGGG AAAAGTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500 

CATTGAGGAG CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 

TATTGAGCTT ACTTACTTGG ACGCAACATT GCAAGCGCTG TGAATGGAAA CAGAATACAC 4620

TTAACATAGA AATGAATGAT TGCTTTCGCT TCTACAGTGC AAGGATTTTT TTGTACAAAA 4680

CTTTTTTAAA TATAAATGTT AAGAAAAATT TTTTTTAAAA AACACTTCAT TATGTTTAGG 4740

GGGGAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCCATTT 4800 

AAAAATGGTA GTGGAAATTC TATGCCTTGG ATACACACCG CTCTTCAGGT TGTAAAAAAA 4860 

AAAAACATAC ATTGGGGAAA GGTTTAAGAT TATATAGTAC TTAAATATAG GAAAATGCAC 4920 

3 ATTCCTATGC TAAAATACAT TTATGGTCTT TTTTCTGTAT TTCTAGAATG 4980 

VTGTTCA TCTAGTGTTA GGCACTATAG TATTTATATT GAAGCTTGTA 5040

3 TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100

A GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 

VAAAAAA AAAAAAAGAG AGAAACAAAA AAGTGTGACA TTACAGATGA 5220 

r AATGGCGGTT TGGGGGAGCC TGCTAGAATG TCACATGGAT GGCTGTCATA 5280

GGGGTTGTAC ATATCCTTTT TTGTTCCTTT TTCCTGCTGC CATACTGTAT GCAGTACTGC 5340

AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGT CCCTTTCCTT CTATCACCCT 5400 

ACATTCCAGC ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AAGGAAAAAA 5460

AAAAAAAAAC CAATGTTTTG CAGTTTTTTT CATTGCCAAA AACTAAATGG TGCTTTATAT 5520

TTAGATTGGA AAGAATTTCA TATGCAAAGC ATATTAAAGA GAAAGCCCGC TTTAGTCAAT 5580

ACTTTTTTGT AAATGGCAAT GCAGAATATT TTGTTATTGG CCTTTTCTAT TCCTGTAATG 5640 

AAAGCTQTTT GTCGTAACTT GAAATTTTAT CTTTTACTAT GGCAGTCACT ATTTATTATT 5700

GCTTATGTGC CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTTTTATTTT TCTTTQTTTT 5760 

TATTTTTTTT TTTATTTAGA TGACCAAAGG TCATTACAAC CTGGCTTTTT ATTGTATTTG 5820 

TTTCTGGTCT TTGTTAAGTT CTATTGGAAA AACCACTGTC TGTGTTTTTT TGGCAGTTGT 5880 

CTGCATTAAC CTGTTCATAC ACCCATTTTG TCCCTTTATT GAAAAAATAA AAAAAATTAA 5940 



MSRRKQGKPQ HLSKREFSPE PLEAILTDDE PDHGPLGAPE GDHDLLTCGQ CQMNFPLGDI 
LIFIEHKRKQ CNGSLCLEKA VDKPPSPSPI EMKKASNPVE VGIQVTPEDD DCLSTSSRRI 
CPKQEHIADK LLHWRGLSSP RSAHGALIPT PGMSAEYAPQ GICKDEPSSY TCTTCKQPFT 
SAWFLLQHAQ NTHGLRIYLE SEHGSPLTPR VGIPSGLGAE CPSQPPLHGI HIADNNPFNL 
LRIPGSVSRE ASGLAEGRFP PTPPLFSPPP RHHLDPHRIE RLGAEEMALA THHPSAFDRV 
LRLNPMAMEP PAMDFSRRLR ELAGNTSSPP LSPGRPSPMQ RLLQPFQPGS KPPFLATPPL 
PPLQSAPPPS QPPVKSKSCE FCGKTFKFQS NLWHRRSHT G3KPYKCNLC DHACTQASKL 
KRHMKTHMHK SSPMTVKSDD GLSTASSPEP GTSDLVGSAS SALKSWAKF KSENDPNLIP 
ENGDEEEEED DEEEEEEEEE EEEELTESER VDYGFGLSLE AARHHENSSR GAWGVGDES 
RALPDVMQGM VLSSMQHFSE AFHOVLGEKH KRGHLAEAEG HRDTCDEDSV AGESDRIDDG 
3 ESASGGLSKK LLLGSPSSLS PFSKRIKLEK EFDLPPATMP NTENVYSQWL 



QSSKLTRHMK THGQVGKDVY KCEICKMPFS VYSTLEKHMK KWHSDRVLNN DIKTE 



GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AGGCCCGGCC GGTGCGCAGA GCATGGCGGG 
TGCGGGCCCG AAGCGGCGCG CGCTAGCGGC GCCGGCGGCC GAGGAGAAGG AAGAGGCGCG 
GGAGAAGATG CTGGCOGCCA AGAGCGCGGA CGGCTCGGCG CCGGCAGGCG AGGGCGAGGG 
CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCGTG GCCATCATCG TGGGGACCAT 
TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GC7CGCCGGG 
GCTGGCGCTG GTGGTGTGGG CCGCGTGCGG CGTCTTCTCC ATCGTGGGCG CGCTCTGCTA 
CGCGGAGCTC GGCACCACCA TCTCCAAATC GGGO"( CGftC TA( 1 t T 

CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 
AT CGCAGTAC ATCGIGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 
CTGCCCGGTG CCCGAGGACC CAGCCAAGCT CCTCGCCIGC CTCTCCCTCC TGCTGCTCAC 600
GGCCGTGAAC TGCTACAGCG TGAAGGCCGC CACCCGGGTC CAGGATGCCT TTGCCGCCGC 660 
CAAGCTCCTG GCCCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GGAAGGGTGA 720
TGTGTCCAAT CTAGATCCCA ACTTCTCATT TCAACC:ACC V,. V.77 ;CAT ' 7GG3GAACAT 780 
TGTGCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA rGGAATTACT TGAATTTOGT 840 
CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900 
CAT CGTGACG CTGGTGTACC TCCTCACCAA CCrGGCGTAC T7CAGCACCC TGTCCACCGA 960 

GCAGATGCTG TCGTCCGAGG CCGTGGCCGT GGAC77GGGG AAC7A7CACC 7GGC-CGTCAT 1020 

GTCCTGGATC ATCCCCGTCT TCGTGGGCCT GTCCTGCTTC GGCTOCGTCA ATGGGTCCCT 1080 

GTTCACATCC TCCACGCTCT TCTTCGTGGG GTCCCGGGAA GGCCACCTGC CCTCCATCCT 1140 

CTCCATGATC CACCCACAGC TCCTCACCCC CGTGCCGTCC CTCGTGT7CA CGTGTGTGAT 1200 



C TGGCTGCGCC ACAGAAAGCC 1320 

G CGGCCCATCA AGGTGAACCT GGCCCTGCC7 GTGTTCT7CA TCCTGGCCTG 1380 

G ATCGCCGTCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TCGGCTTCAC 1440 

CATCATCCTC AGCGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCCCAA 1500 

GTGGCTCCTC CAGGGCATCT TCTCCACGAC CGTCCTGTG7 CAGAAGCTCA TGCAGGTGGT 1560 
CCCCCAGGAG ACATAGCCAG GAGGCCGAGT GGCTGCCGGA GGAGCATGC 



31 
MAGAGPKRRA LAAPAAEEKE EAREKMLAAK SADGSAPAGE GEGVTLQRNI TLLNGVAIIV
GTIIGSGIPV TPTGVLKEAG SPGLALWWA ACGVFSIVGA LCYAELGTTI SKSGGDYAYM 
LEVYGSLPAF LKLWIELLI I RPSSQYIVAL VFATYLLKPL FPTCPVPEEA AKLVACLCVL 
LLTAVNCYSV KAATRVQDAF AAAKLLALAL IILLGFVQIG KGDVSNLDPN FSFEGTKLDV 
GNIVLALYSG LFAYGGWNYL NFVTEEMINP YHNLPLAIII SLPIVTLVYV LTNLAYFTTL 
STEQMLSSEA VAVDFGKYHL GVMSWIIPVF VGLSCFGSVN GSLFTSSRLF FVGSREGHLP
SILSMIHPQL LTPVPSLVFT CVMTLLYAFS KDIFSVINFF SFFNWLCVAL AIIGMIWLRH 
RKPELERPIK VMLALPVFFI LACLFLIAVS FWKTPVECGI GFTIILSGLP VYFFGVWWKN 
KPKWLLQGIF STTVLCQKLM QWPQET 




I I I I I 

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT 
TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGC 
AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACC..^ 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATC

TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGCCGAG CGTGTGTGGA 300 

GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCTGCTCC AACGTCTGCT 360

TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420

CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTCAG GAGAAGAGGC 480

ACCGAGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540

GTGGGCTCTG GTGGACATAT GTCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT 600

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660

ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720

TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840

TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900 

CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC 960 

GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGG7CTG GCAGGTTGGG 1020 

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080 

CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAGCCAG TTCCTAGTCC 1140

TCAACTCCAG CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA 1200 
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC 



Protein Accession #: NP_005259.1



MNWSIFEGLL SGVNKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 
SNVCFDEFFP VSHVRLWALQ LILVTCPSLL WMHVAYREV QEKRHREAHG ENSGRLYLN? 
GKKRGGLWWT YVCSLVFKAS VDIAFLYVFK SFYPKYILPP WKCHADPCP NIVDCFISKP 



I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC T7CCTCCTCC TCACCCTCCT 
CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 
CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 
TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 
CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG A7GCCAAGGA 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATG7GACC 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAG3GATT C7GGGAAGCT 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT T " ~" 

A AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC T 



MQHRGFLLLT LLALLALTSA VAKKKDKVKK 
Q3AQTQRIRC RVPCNWKKEF GADCKYKPEN 
RVTKPCTPKT KAKAKAKKGK GKD 



Seq ID NO: 92 DNA sequence 
Nucleic Acid Accession #: N
Coding sequence: 98-802 

1 11 21 



CTCTACCTGA 



GCTCTCCTTC 
GAATGGACTT 
TAAGCAGAAA 
CAGATGGGCT 
GGACCATGAA 



CACAGCTGCA GCCTGCAATT 
GAACAAGGTG AACGCCCAGC 
CTCCTACTGG CTGCTCAGGT 
CACAGCAAAG TGGTCTCAGA 
AGCAGGCCCG GGAACAAAGG 
GCTACTGAGC AGGAGGAGGG 



CACTCCCACT 



ACAAAAGGAC 
CAAGTTTGTC 
CATCTCTCTC 



TAAGCTAGTC 



GACTGCCCTG 
AGTGCAGGAC 
TGTCGTAAGT 
TGTGCTTAGT 
TGGAATTTGC 
TTCCATGGCC 
GAGTGATAAT 
TTTTTCAAAA 



TATTGGAAAC 
ACAGCTGTGA 
AGCTCCACTC 
GAGCACATCA 
ACCAAAGCTC 
GAGTTCTGTG 
ACGTCATGCT 



AAGTTGCCCG 
AAACCAGAGT 
TATTTGGGAA 



GAATCTGCGC 
GTGCAGAAAG 
CACAAAGCCC 



41 

I 

" GCCTGGGATT 
AAGATCTGTA 
GAGGGGAAAA 
ACTCTGGGCA 
ACCAAAGACC 
AAGGTTGAGT 
TCATGCCTAA 
TCACAGAAAG 
GATTTTCCAG 



51 
1 

- GCACTGGATC 
GCCTCACCCT 
AAAAAGTGAA 
ACACCCAGAT 
AAGCCAACTG 
GCACTCAATT 
AGCTCAAGGA 
ACATCTGTAG 
AATCCAGTCT 
AAACAGAGAT 



GAGTGCAACG 
CTTATTTTTC 
CACACAGCTA 
TTCAGTGCAA 
AAAAAAAAAA 



ACTTTAAAGC 
AAATATTTAA 
TTGGATGCGA 
TGTGTTTGAG 
CGAACTTTCT 



GGTTCCTTTA A 



ACCAGAGGAA 
TCCTCAGCAT 
AGAGATGTCA 
TCTCTACAGT CCCCCCAAAA TATGAACTTT 



TGTTCAGAGG CTGTTTCCTG CAGCATGTAT 
CAGCGAAGAG TCTTTGAGCT GAATGAGCCA 
GCTGAATTAA TGGTAATAAA ACTCTGGGTG 



Seq ID NO: 93 Protein sequence: 



MKICSLTLLS FLLLAAQVLL VEGKKKVKNG LHSKWSEOK DTLGNTQ I KQ KSRPGNKGKF 
VTKDQANCRW AATEQEEGIS LKVECTQLDH EFSCVFAGNP TSCLKLKDER VYWKQVARNL 
RSQKDICRYS KTAVKTRVCR KDFPESSLKL VSSTLFGNTK PRKEKTEMSP REHIKGKETT 
PSSLAVTQTM ATKAPECVED PDMANQRKTA LEFCGETWSS LCTFFLE IVQ DTSC 
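
The "Coding sequence" range given with a DNA record (98-802 for Seq ID NO: 92) can be sanity-checked against the length of the paired protein sequence. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the range includes the terminal stop codon, as is customary for GenBank-style CDS annotations. Neither assumption is stated in this listing, and the function name `cds_summary` is purely illustrative.

```python
def cds_summary(start: int, end: int) -> dict:
    """Summarize a coding range given as 1-based, inclusive coordinates.

    Assumption (not stated in the listing): the range ends with a stop
    codon, so the encoded protein has one residue fewer than the number
    of codons.
    """
    length = end - start + 1  # inclusive range
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError("range length is not a multiple of three")
    codons = length // 3
    return {"nucleotides": length, "codons": codons, "residues": codons - 1}


if __name__ == "__main__":
    # Coding range listed for Seq ID NO: 92
    print(cds_summary(98, 802))
```

Under these assumptions the 98-802 range spans 705 nucleotides, that is 235 codons, or 234 residues once the stop codon is set aside. A range whose length is not divisible by three would point to a transcription error in the coordinates.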

Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101

Coding sequence: 125-1891 



CTCCTCACAG GTGTGTCTCT 
TGCCAGAAAG GTCACCTATC 
TGCGATGGAA GCTGCAGATG 



AGTCCTCGTG 
CTGAACCCCA 
CCTCCAGGAG 



TGCCAAGACC 
CCTGAAGCCA 
CATCCAGTTT 
GGAAGGCAAG 
TACCTTTGCC 



ACCAACGGGC 
GGGGAAGGTA 
GTCGAGTCCG 
AGGTCGCCGT 
GAAAAGGGCG 
ATGGAGCCCG 
CGGTCCAAGT 
GCGGTCAAGT 



GTTGCCTGCC 
GCAAGCCTGA 
CAACGGGTCG 
GGAGAATGGC 



GGGACGACAA 
ACGCAGGGCT 
ACGTGCGCAA 



CTTTGAGGCC CGCAAGTGTC 
CCAGACCTGC ATCTGCTACC 
AGTGGAGGAG GCCAAGGCCG 
GCTCAAGATC ATTGAGATTG 
CAAGAGCTTC ACCACCAATG 
GGACCTGGAG AAGCAAAAGG 
TGTGGACCAA GTGAAGGTGA 
GGACAAGCAG ACCCGGGAGC 



CCGGCTCCGA 
CCTGCCTGGT 
CCGCCTTCCG 
CCGTGCATGG 
TTTGCATGTT 



AGAAGGCCAT 
TCATGGATGC 



GTTCGCGGGC 
GAACTCCAAC 
CCAGCTGGGG 
GTCCATTTTC 
GCGGAACAGC 
GGAGGTGCTG 
GTGCCAGGCC 
AGACCACCAG 
CAAGACGATG 
CCAGGAGCAC 
GGAGCTGTCA 
TGAGAAGTGG 
CCTGGAGCAG 
GGCTGCGCTG 



ACCAAGGCTG 
GGCAAGAGCC 
AATGAGTGGC 
TACTTCAGCA 
GCTGCCAAGA 
TCGGAGTCCC 



CCAAGCACCC 
CCAGGGATGC 
ACGGCAAGGA 



GGCGACCCAT 360 
TGGACTCTAT 420 



GGAAGCCCAC 



CTGCTCGAGC 
GAGCTCTTCT 
AAGAATCATA 
CTGCAAAAGG 



ATTACTCTCT 
GACAGTCACT 
AGATGTGCAA 



CATCAGCGAC TCTGTGT7GT 
CCCCCCACCC CTGCCCACCT 
AGGCAACTTC 



GCIGGAGGGG GAGGGCCTGG 
ATGCATGCGC CACGTTGAGA 
GAACCACATG GAGAACGGTG GTGACCATCG CTATGTGAAC AACTACACGA AC 



GCATCGGCAA 
AGCTGCATCT 
CCATCCGGGA 
GCCAGACCGA 
GCACCGTGAC 
AGCAGCTGCA 

ACCTGGTGCG 
AGCAGGAIGC 
TGCTGCATGA 
TTCTGCAGGA 
ATCATGTCCT 
TGCTCAATGT 
TCATTGAGAG 



TGACCTCAGA



TGGGGTCCGG ACATCATACC 
GAAGAATTTC AACAATCTCT 
CTCCTCCAGC ATTCAGAACT 
CTCCCTGAAA GGCTATCCCT 
TTGGAAATCT GGCAAGCAGA 
CAACGGGATT GGGTCCAACG 
CCCCTGCTCT TCCTCCTGAC 



CTCCGACTTC 
TGGTCACCAT 
CCTGCCCTAA 
CCGCATGGTA 
TTCTCTCCAA 
ATCTCCCATT 
TCGTCCTACC 
AAACCTCTCA 
TGAGTCTGTG 
ATTCCTCTGT 
CAAATAGCTA 
ACTTTGGCAT 
CACTGCCCCC 



CTGACAATGA 
CCCTCATGCG 
CTATGCTGTC 



GAGATACTCC 
TCCTGGCCGC 
AGGTAACTAC 
CCTGCCCGTC 
GAGCCAAAGC 
TCACTACCGG 
ACAGCCACCC
GTGCTCTCTC 
CTGCAGATGG 
ACAGCCACTT 



TCCACCCATG 
TTCAGTCTAC 
GTGCCTTACA 
TGATTACCCC 
AGCAGCACAG 



CCTGCTGCTC 
CACCTGCCCT 
CCCACTGGCC 
CATTCCTGTG 
CCCGCCAGCC 
TCAGCCTGCC 
GGCCCTATCC 
CACATGGCCC 
TATCAATGCC 
GTGTCTTGAC 
GTCCCTGGAG 
TCCCTCTGCT 
CTGGCCCAGC 
TCTCTCTGGC 
ACCCTCAGCC 
TATCAGGGTG 
TCCCGTCTCA 



TTGCCTTCTA 
CTGCAGCCCT 
ACACTCCATT 
CTCAGAGGCC 
TCCTCCTCTC 



TTCACCAAGG AGACCACCCA 
ACCTCCCGGG TCTGGGAGTA 
GTCCAAGGCA GCTCCTCCTT 
CCCAAGGCCC AGCCCCAGAC 
CCATTCTACG TCAACAAAGG 
GGAAGGAACG AGGCGCCACA 



ACCTCCTGCT 
CAGCATGGCA 
ATCACCCTAC 
GGTGGCTTCT 
GAGATAAAGA 



GGGCTGGATC 
ACGCCCTGCT 
TCAGCAGATG 



GAACCTGCAG 



CTTGGGGGCA 
TCCTGCCTTG 
AGGGGTGAGA 
TGGGGGCTAG 
GTCTCCAGGC 
CCTGGACAGC 
CTGGCCCTAC 



2160 



ATAAACCATT GGTCTGTC 



GATGGAGTGT 
GTTGCCCCAT 
CTCAAGGATT 
ACAGCCCCAG 
AGACATCCAG 



CCTGACTGGC 
ATTCCCTTAA 
CATTTGCCTA 
GGCTGGGCTG 
CAGAGGCTGC 



TGGCCAAGGG 
GGTCTCCACC 
AGGATGACCT 
CATGATATAA 
CAGAATTTCA 
ACCGCAAAAG 
CTCCTCCTTC 
CAAAACCAGG 
GCTCTGGAAG 
CCCCTTTGGA 



QKAVKSCLVC 
TCICYLCMPQ 
SFTTNEKAIL 
KQTREQLHSI 
MRHVEKMCKA 
VRTSYQPSSP 



GSSPEARDAR 
AGNEWRRPII 
IPSESRKPTV 
QASFCELHLK 



I 



EQNFRDLVRD 
SDSVLFLQEF 
DLSRNFIERN 
GRFTKETTQK 
QSPKAQPQTW 



SPSGPSGSLE MGTKADGKDA KTTNGHGGEA 
QFVESGDDKN SNYFSMDSME GKRSPYAGLQ 
SIMEPGETRR NSYPRADTGL FSRSKSGSEE 
PHLEGAAFRD HQLLEPIRDF EARKCPVHGK 
EEAKAEKETE LSLQKEQLQL KIIEIEDEAE 
LEKQKEEVRA ALEQREQDAV DQVKVIMDAL 
GALMSNYSLP PPLPTYHVLL EGEGLGQSLG 



AEGKSLGSAL 
LGAAKKPPVT 
VLCDSCIGNK 
TMELFCQTDQ 
KWQKEKDRIK 



NFKDDLLNVC 420 



S SSIQNSDIOL PWQGSSSFS 



I 



I 



AGGCTCTGAA 
AGTCAGAAAG 
CCAATCACCT 



CAGCACTCCT 
CAGAGACTTG 
CTCTGCCTCT 
GGCAGAAGAC 
GGTTTGTGCA 
GAAACAGAAA 
GGCTGCGGCC 
AGATGCAGTG 
CCTGTGGAGA 
GCTGCGCATG 



CCCTCCTGAG 
TTCAGTTCCG 
GAGGGCAATT 



GTTTTCTAGT 
GCTGCTTGGA 
GCTCATGGCC 
TGCAGGGGTG 
ACCGAGTCTA 
CAACTATGCT 
ATAGCAATTT 
CTCATGATCT 
TTTGGATTTG 
TTTGATGTTT 
CAGAGAATGC 
AGGCGGGCCC 



CCTTCCCGGT TGGCGCGCGC CCGGGGCGGC GGCGCTGGAG 

CGGAGCCTAG TTATGTCTGG GAGGCGAACG CGGTCCGGAG GAGCCGCTCA 
CCAAGGGCCC CATCTCCTAC TAAGCCTCTG CGGAGGTCCC AGCGGAAATC 
CTCCCGAGCA TCCTCCCTGA AATCTGGCCG AAGACACCCA GTGCGGCTGC 
CCCATCGTCT T AAAGAGG AT CGTGGCCCAT GCTGTAGAGG TCCCAGCTGT 
CGCAGGAGCC CTAGGATTTC CTTTTTCTTG GAGAAAGAAA ACGAGCCCCC 
CTTACTAAGG AGGACCTTTT CAAGACACAC AGCGTCCCTG CCACCCCCAC 
GTGCCGAACC CTGAGGCCGA GTCCAGCTCC AAGGAAGGAG AGCTGGACG C 
GAAATGTCTA AGAAAGTCAG GCGTTCCTAC AGCCGGCTGG AGACCCTGGG 
ACCTCCACCC CAGGCCGCCG GTCCTGCTTT GGCTTCGAGG GGCTGCTGGG 
TTGTCCGGAG TCTCGCCAGT GGTGTGCTCC AAACTCACCG AGGTCCCCAG 
AAGCCCTGGG CCCCAGACAT GACTCTCCCT GGAATCTCCC CACCACCCGA 
CGTAAGAAGA AGAAAATGCC AGAGATCTTG AAAACGGAG C TGGATGAGTG 
ATGAATGCCG AGTTTGAAGC TGCTGAGCAG TTTGATCTCC TGGTTGAATG 
GGGGGTGCAC CTGGCCAGAC TCTCCCTCCT GTCCTGTACA TAGCCACCTC 
GGACACTTAG GGTCCCCTCC CCTGGTCTTG TTACCTGTGT GTGTGCTGGT 
AGGACTGTCT GCCTTTGAGG GCTTGGGCAG CAGCGGCAGC CATCTTGGTT 
GGGCCGCCTG GCCCAGCCAC TCACTGGTGT CCIGTCTCTT GTCGTCCTGT 
TCCCCAAAGT ACCATAGCCA GTTTCCAGAT GGGCCACAGA CTGGGGAGGA 
CCCAGCCAGA AGTTAAAGGG CTGAGGGTTG AGGTGAGAGG CACCTCTGCT 
GGGGTGGCTG CTTGGAAATA GGCCCAGGGG CTCTGCCAGC CTCGGCCTCT 
TTGCCTTCTG TTGGTGGCTT TCTTCTTGAA CCCACCTGTG TAAAGAGGTT 
TGGGTTTCCC CTTTGATTCT GTAAATAGTC CCAGAGAGAA TTCGTGGGCT 
CTGTCTTGGA GGAAGAAGCT GGACATT CAG CCTGTGGAGT CTGAGTTTTG 
GGAGCCTTAG TTGGGTCTCA GACCATAAGT GTGTACTACA CAGAAGCTGT 
TCTGGTCTGC TGTTGAGATG TTTGGTAAAT GCCAGGTTGA TAGGGCX3CTG 
GCAAAGGGTG CATTTCAGGG TGTGGCCACC AGGTGCTGTG AGTTTCTGTG 
TCTGGGCTGG TCCCTTGCAC AGGGCCCACG CTGGAGTCTT ACCACTCTGC 
GAAGGTGGCC CCTCTTGTCA CCCATACCCA TTT CTTACAA AATAAGTTAC 
CTTGGCCCTA GAAGAGAAAG TTGAAGAGTC CCAGACCTAC TAGCAT7TTG 
TGTAAAGTCC TCGGAAAGTT TCCTCGCGTA CCAGACAGCG GCGGGGGCTG 
TAGTTTTTGG CCTCCCTATC CTCTCACATG AGAACAC7GC CTGGATGCAT 
CTGGAGAATT TCCCCATCTT TCT< TC T~ ( AT 3TGTG GATTCAATAG 
AAGGCTGCCC TGCCCCCGAC TCTCCTGCCG CACCCCTGGC 
AGAAGTTCGT GGAAGTAGAC GCTGAG3TGT GCAGAGGAGC TGGTGGATAA 
CAGGGAAGAT GAGTGCTGGG TCAGGGTACT TGGATGAAAC GGTGCAGGCC 
TAATAAAACC CTCTGCCAGG TCTGGGAGTC CCAGGCCATC TGCTCAACGC 
TCTGTGGTTT GTCAGACCTG CAAGCAAGCC CCCTGCTGGG GAAGCCTAGG TGTCCTTGAG 2280
A CTGAAGAACT CTTGTCCTCA CTGGCTGATG CAGCAGAACT CTTGGGAAAT 2340 






I I I I I I 

MSGRRTRSGG AAQRSGPRAP SPTKPLRRSQ RKSGSELPSI LPEIWPK7PS A" 

KRIVAHAVEV PAVQSPRRSP RISFPLEKEN EPPGRELTKE DLPKTHSVPA T 

EAESSSKEGE LDARDLEMSK KVRRSYSRLE TLGSASTSTP GRRSCFGFEG LLGAEDLEGV 

SPWCSKLTE VPRVCAKPWA PDMTLPGISP PPEKQKRKKK KMPEILKTEL DEKAAAMNAE 
FEAAEQFDLL VE 



GGGGCATTTC CGGGTCCGGG CCGAGCGGGC GCACGCGCGG GAGCGGGACT CGGCGGCATG 60 

GCGGGCTCCG GAGCCGGTGT GCGTTGCTCC CTGCTGCGGC TGCAGGAGAC CTTGTCCGCT 120

GCGGACCGCT GCGGTGCTGC CCTGGCCGGT CATCAACTGA TCCGCGGCCT GGGGCAGGAA 180 

TGCGTCCTGA GCAGCAGCCC CGCGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 240 

GATTTCGGTT TGCTTGTATT TGTCCGGAAG TCACTCAACA GTATTGAATT TCGTGAAIGT 300 

AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGCCA GAAGATCGCA 360

CCTTACTCTG TTGAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420

AAATGTAAAA TTCCAGCCCT GGACCTTCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480

AGACTCATGG ATGAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATGG AGAACTTGCA 540

TTGAAAAAAA AAATACCAGA TACAGTTTTA GAAAAAGTAT ATGAGCTCCT AGGATTATTG 600 

GGTGAAGTTC ATCCTAGTGA GATGATAAAT AATGCAGAAA ACCTGTTCCG CGCTTTTCTG 660 

GGTGAACTTA AGACCCAGAT GACATCAGCA GTAAGAGAGC CCAAACTACC TGTTCTGGCA 720

GGATGTCTGA AGGGGTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAGAT 780 

CCCCAGACTT CAAGGGAGAT TTTTAATTTT GTACTAAAGG CAATTCGTCC TCAGATTGAT 840 

CTGAAGAGAT ATGCTGTGCC CTCAGCTGGC TTGCGCCTAT TTGCCCTGCA TGCATCTCAG 900

TTTAGCACCT GCCTTCTGGA CAACTACGTG TCTCTATTTG AAGTCTTGTT AAAGTGGTGT 960 

GCCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAG CCCTGGAATC CTTTCTGAAA 1020

CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GCAGTACTTT 1080

ATGGAGCAGT TTTATGGAAT CATCAGAAAT GTGGATTCGA ACAACAAGGA GTTATCTATT 1140

GCTATCCGTG GATATGGACT TTTTGCAGGA CCGTGCAAGG TTATAAACGC AAAAGATGTT 1200

GACTTCATGT ACGTTGAGCT CATTCAGCGC TGCAAGCAGA TGTTCCTCAC CCAGACAGAC 1260 

ACTGGTGACG ACCGTGTTTA TCAGATGCCA AGCTTCCTCC AGTCTGTTGC AAGCGTCTTG 1320

CTGTACCTTG ACACAGTTCC TGAGGTGTAT ACTCCAGTTC TGGAGCACCT CGTGGTGATG 1380

CAGATAGACA GTTTCCCACA GTACAGTCCA AAAATGCAGC TGGTGTGTTG CAGAGCCATA 1440 

GTGAAGGTGT TCCTAGCTTT GGCAGCAAAA GGGCCAGTTC TCAGGAATTG CATTAGTACT 1500

1560 






CCCACATACA AAGACTACGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGATGATG 1680 

TAGCAGATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAGTCTGAAT 1740 

ATGATGAATT TGTAAAATCC GTTTTGAAGA TTGTTGAGAA ATTGGATCTT 1800 

TACAGACTGT TGGGGAACAA GAGAATGGAG ATGAGGCGCC TGGTGTTTGG 1860

ATGATCCCAA CTTCAGATCC AGCGGCTAAC TTGCATCCAG CTAAACCTAA AGATTTTTCG 1920 

GCTTTCATTA ACCTGGTGGA ATTTTGCAGA GAGATTCTCC CTGAGAAACA AGCAGAATTT 1980 

TTTGAACCAT GGGTGTACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCC 2040 

CTCATCAGTG GTTTCTACAA ATTGCTTTCT ATTACAGTAA GAAATGCCAA GAAAATAAAA 2100 

TATTTCGAGG GAGTTAGTCC AAAGAGTCTG AAACACTCTC CTGAAGACCC AGAAAAGTAT 2160 

TCTTGCTTTG CTTTATTTGT GAAATTTGGC AAAGAGGTGG CAGTTAAAAT GAAGCAGTAC 2220

AAAGATGAAC TTTTGGCCTC TTGTTTGACC TTTCTTCTGT CCTTGCCACA CAACATCATT 2280 

GAACTCGATG TTAGAGCCTA CGTTCCTGCA CTGCAGATGG CTTTCAAACT GGGCCTGAGC 2340 

TATACCCCCT TGGCAGAAGT AGGCCTGAAT GCTCTAGAAG AATGGTCAAT TTATATTGAC 2400 

AGACATGTAA TGCAGCCTTA TTACAAAGAC ATTCTCCCCT GCCTGGATGG ATACCTGAAG 2460 

ACTTCAGCCT TGTCAGATGA GACCAAGAAT AACTGGGAAG TGTCAGCTCT TTCTCGGGCT 2520 

GCCCAGAAAG GATTTAATAA AGTGGTGTTA AAGCATCTGA AGAAGACAAA GAACCTTTCA 2580

TCAAACGAAG CAATATCCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTGGATCT 2640 

CTAGGAGGAC AAATAAACAA AAATCTTCTG ACAGTCACGT CCTCAGATGA GATGATGAAG 2700 

AGCTATGTGG CCTGGGACAG AGAGAAGCGG CTGAGCTTTG CAGTGCCCTT TAGAGAGATG 2760 

AAACCTGTCA TTTTCCTGGA TGTGTTCCTG CCTCGAGTCA CAGAATTAGC GCTCACAGCC 2820 

AGIGACAGAC AAACTAAAGT TGCAGCCTGT GAACTTTTAC ATAGCATGGT TATGTTTATG 2880 

T GCCAGAAGGG GGACAGGGAG CCCCACCCAT GTACCAGCTC 2940 

T GCTGCTTCGA CTTGCGTGTG ATGTTGATCA GGTGACAAGG 3000 
G AGCCACTAGT TATGCAGCTG ATTCACTGGT TCACTAACAA 
C CTTACTAGAA GCTATATTGG A 
I TTGTGGTCGG TGTATTCGAG A 
A GCAGGAGAAG AGTCCAGTAA A 

A GCCTTGCGCT TCACCCCAAT GCTTTCAAGA GGCTGGGAGC ATCACT7GCC 3300 

TTTAATAATA TCTACAGGGA ATTCAGGGAA GAAGAGTCTC TGGTGGAACA GTTTGTGTTT 3360 

I GGAGAGTCTG GCCTTAGCAC ATGCAGATGA GAAGTCCTTA 3420 

C AACAGTGTTG TGATGCCATT GATCACCTAT GCCGCATCAT TGAAAAGAAG 3480 
CATGTTTCTT TAAATAAAGC AAAGAAACGA CGTTTGCCGC GAGGATTTCC Ai 
TCATTGTGTT TATTGGATCT GGT CAAGTGG CTTTTAGCTC ATTGTGGGAG G 

GAATGTCGAC ACAAATCCAT TGAACTCTTT TATAAATTCG TTCCTTTATT GCCAGGCAAC 3660 

AGATCCCCTA ATTTGTGGCT GAAAGATGTT CTCAAGGAAG AAGGTGTCTC TTTTCTCATC 3720

AACACCTTTG AGGGGGGTGG CTGTGGCCAG CCCTCGGGCA TCC7GGCCCA GCCCACCCTC 3780 

TTGTACCTTC GGGGGCCATT CAGCCTGCAG GCCACGCTAT GCTGGCTGGA CCTGC7CCTG 3840 
GCCGCGTTGG AGTGCTACAA CACGTTCATT GGCGAGAGAA CTG7AGGAGC GCTCCAGGTC 3900

CTAGGTACTG AAGCCCAGTC TTCACTTTTG AAAGCAGTGG CTTTCTTCTT AGAAAGCATT 3960 

GCCATGCATG ACATTATAGC AGCAGAAAAG TGCTTTGGCA CTGGGGCAGC AGGTAACAGA 4020

ACAAGCCCAC AAGAGGGAGA AAGGTACAAC TACAGCAAAT GCACCGT7GT GGTCCGGATT 4080 

ATGGAGTTTA CCACGACTCT GCTAAACACC TCCCCGGAAG GATGGAAGCT CCTGAAGAAG 4140 

GACTTGTGTA ATACACACCT GATGAGAGTC CTGGTGCAGA CGCTGTGTGA GCCCGCAAGC 4200 

ATAGGTTTCA ACATCGGAGA CGTCCAGGTT ATGGCTCATC TTCCTGA7GT 77GTG7GAAT 4260 

CTGATGAAAG CTCTAAAGAT GTCCCCATAC AAAGATATCC TAGAGACCCA TCTGAGAGAG 4320 

CAAGTGGACA GGAGCAGGCT GGCTGCTGTT GTGTCTGCC7 GTAAACAGCT TCACAGAGCT 4440 

GGGCTTCTGC ATAATATATT ACCGTCTCAG TCCACAGAT7 7GCA7CA77C TGTTGGCACA 4500 

GAACTTCTTT CCCTGGTTTA T AAAGGCAT T GCCCCTGGAG ATGAGAGACA G7GTC7GCCT 4560 

TCTCTAGACC TCAGTTGTAA GCAGCTGGCC AGCGGACTTC TGGAGTTAGC CT7TGCTTTT 4620 

GGAGGACTGT GTGAGCGCCT TGTGAGTCTT CTCCTGAACC CAGCGGTGCT GTCCACGGCG 4680 

TCCTTGGGCA GCT CACAGGG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAGC 4740 

TTGTTCTCAG AAACGATCAA CACGGAATTA TTGAAAAATC TGGATCTTGC TGTATTGGAG 4800

CTCATGCAGT CTTCAGTGGA TAATACCAAA ATGGTGAGTG CCGTTTTGAA CGGCA7GTTA 4860 

GACCAGAGCT TCAGGGAGCG AGCAAACCAG AAACACCAAG GACTGAAACT TGCGACTACA 4920

ATTCTGCAAC ACTGGAAGAA GTGTGATTCA TGGTGGGCCA AAGATTCCCC TCTCGAAACT 4980 

AAAATGGCAG TGCTGGCCTT ACTGGCAAAA ATTTTACAGA TTGATTCATC TGTATCTTTT 5040

AATACAAGTC ATGGTTCATT CCCTGAAGTC TTTACAACA7 ATATTAGTCT ACTTGCTGAC 5100 

ACAAAGCTGG ATCTACATTT AAAGGGCCAA GCTGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CTCACTGGAG GCAGTCTGGA GGAACTTAGA CGTGTTCTGG AGCAGCTCAT CGTTGCTCAC 5220 

TTCCCCATGC AGTCCAGGGA ATTTCCTCCA GGAACTCCGC GGTTCAATAA TTATGTGGAC 5280 

TGCATGAAAA AGTTTCTAGA TGCATTGGAA TTATCTCAAA GCCCTATGTT GTTGGAATTG 5340

ATGACAGAAG TTCTTTGTCG GGAACAGCAG CATGTCATGG AAGAATTATT TCAATCCAGT 5400

TTCAGGAGGA TTGCCAGAAG GGGTTCATGT GTCACACAAG TAGGCCTTCT GGAAAGCGTG 5460 

TATGAAATGT TCAGGAAGGA TGACCCCCGC CTAAGTTTCA CACGCCAGTC CTTTGTGGAC 5520 

CGCTCCCTCC TCACTCTGCT GTGGCACTGT AGCCTGGATG CTTTGAGAGA ATTCTTCAGC 5580 

ACAATTGTGG TGGATGCCAT TGATGTGTTG AAGTCCAGGT TTACAAAGCT AAATGAATCT 5640 

ACCTTTGATA CTCAAATCAC CAAGAAGATG GGCTACTATA AGATTCTAGA CGTGATGTAT 5700 

TCTCGCCTTC CCAAAGATGA TGTTCATGCT AAGGAATCAA AAATTAATCA AGTTTTCCAT 5760 

GGCTCGTGTA TTACAGAAGG AAATGAACTT ACAAAGACAT TGATTAAATT GTGCTACGAT 5820 

GCATTTACAG AGAACATGGC AGGAGAGAAT CAGCTGCTGG AGAGGAGAAG ACTTTACCAT 5880 

TGTGCAGCAT ACAACTGCGC CATATCTGTC ATCTGCTGTG TCTTCAATGA GTTAAAATTT 5940

TACCAAGGTT TTCTGTTTAG TGAAAAACCA GAAAAGAACT TGCTTATTTT TGAAAATCTG 6000 

ATCGACCTGA AGCGCCGCTA TAATTTTCCT GTAGAAGTTG AGGTTCCTAT GGAAAGAAAG 6060 

AAAAAGTACA TTGAAATTAG GAAAGAAGCC AGAGAAGCAG CAAATGGGGA TTCAGATGGT 6120

CCTTCCTATA TGTCTTCCCT GTCATATTTG GCAGACAGTA CCCTGAGTGA GGAAATGAGT 6180

CAATTTGATT TCTCAACCGG AGTTCAGAGC TATTCATACA GCTCCCAAGA CCCTAGACCT 6240 

GCCACTGGTC GTTTTCGGAG ACGGGAGCAG CGGGACCCCA CGGTGCATGA TGATGTGCTG 6300

GAGCTGGAGA TGGACGAGCT CAATCGGCAT GAGTGCATGG CGCCCCTGAC GGCCCTGGTC 6360 

AAGCACATGC ACAGAAGCCT GGGCCCGCCT CAAGGAGAAG AGGATTCAGT GCCAAGAGAT 6420

CTTCCTTCTT GGATGAAATT CCTCCATGGC AAACTGGGAA ATCCAATAGT ACCATTAAAT 6480

ATCCGTCTCT TCTTAGCCAA GCTTGTTATT AATACAGAAG AGGTCTTTCG CCCTTACGCG 6540

AAGCACTGGC TTAGCCCCTT GCTGCAGCTG GCTGCTTCTG AAAACAATGG AGGAGAAGGA 6600

ATTCACTACA TGGTGGTTGA GATAGTGGCC ACTATTCTTT CATGGACAGG CTTGGCCACT 6660

CCAACAGGGG TCCCTAAAGA TGAAGTGTTA GCAAATCGAT TGCTTAATTT CCTAATGAAA 6720

CATGTCTTTC ATCCAAAAAG AGCTGTGTTT AGACACAACC TTGAAATTAT AAAGACCCTT 6780

GTCGAGTGCT GGAAGGATTG TTTATCCATC CCTTATAGGT TAATATTTGA AAAGTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA GTAGGGATTC AATTGCTAGG CATCGTGATG 6900

GCCAATGACC TGCCTCCCTA TGACCCACAG TGTGGCATCC AGAGTAGCGA ATACTTCCAG 6960 

GCTTTGGTGA ATAATATGTC CTTTGTAAGA TATAAAGAAG TGTATGCCGC TGCAGCAGAA 7020 

GTTCTAGGAC TTATACTTCG ATATGTTATG GAGAGAAAAA ACATACTGGA GGAGTCTCTG 7080

TGTGAACTGG TTGCGAAACA ATTGAAGCAA CATCAGAATA CTATGGAGGA CAAGTTTATT 7140 

GTGTGCTTGA ACAAAGTGAC CAAGAGCTTC CCTCCTCTTG CAGACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCTGCCAAA ATTTCATGGA GTGTTGAAAA CACTCTGTCT GGAGGTGGTA 7260 

CTTTGTCGTG TGGAGGGAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA GGACTTCGTT 7320 

CAAGTCATGA GACATAGAGA TGATGAAAGA CAAAAAGTAT GTTTGGACAT AATTTATAAG 7380 

ATGATGCCAA AGTTAAAACC AGTAGAACTC CGAGAACTTC TGAACCCCGT TGTGGAATTC 7440 

GTTTCCCATC CTTCTACAAC ATGTAGGGAA CAAATGTATA ATATTCTCAT GTGGATTCAT 7500 

GATAATTACA GAG AT CCAGA AAGTGAGACA GATAATGACT CCCAGGAAAT ATTTAAGTTG 7560 

GCAAAAGATG TGCTGATTCA AGGATTGATC GATGAGAACC CTGGACTTCA ATTAATTATT 7620 

CGAAATTTCT GGAGCCATGA AACTAGGTTA CCTTCAAATA CCTTGGACCG GTTGCTGGCA 7680

CTAAATTCCT TATATTCTCC TAAGATAGAA GTGCACTTTT TAAGTTTAGC AACAAATTTT 7740 

CTGCTCGAAA TGACCAGCAT GAGCCCAGAT TATCCAAACC CCATGTTCGA GCATCCTCTG 7800

TCAGAATGCG AATTTCAGGA ATATACCATT GATTCTGATT GGCGTTTCCG AAGTACTGTT 7860 

CTCACTCCGA TGTTTGTGGA GACCCAGGCC TCCCAGGGCA CTCTCCAGAC CCGTACCCAG 7920

GAAGGGTCCC TCTCAGCTCG CTGGCCAGTG GCAGGGCAGA TAAGGGCCAC CCAGCAGCAG 7980

CATGACTTCA CACTGACACA GACTGCAGAT GGAAGAAGCT CATTTGATTG GCTGACCGGG 8040 

AGCAGCACTG ACCCGCTGGT CGACCACACC AGTCCCTCAT CTGACTCCTT GCTGTTTGCC 8100 

CACAAGAGGA GTGAAAGGTT ACAGAGAGCA CCCTTGAAGT CAGTGGGGCC TGATTTTGGG 8160 

AAAAAAAGGC TGGGCCTTCC AGGGGACGAG GTGGATAACA AAGTGAAAGG TGCGGCCGGC 8220

CGGACGGACC TACTACGACT GCGCAGACGG TTTATGAGGG ACCAGGAGAA GCTCAGTTTG 8280 

ATGTATGCCA GAAAAGGCGT TGCTGAGCAA AAACGAGAGA AGGAAATCAA GAGTGAGTTA 8340 

AAAATGAAGC AGGATGCCCA GGTCGTTCTG TACAGAAGCT ACCGGCA03G AGACCTTCCT 8400 

GACATTCAGA TCAAGCACAG CAGCCTCATC ACCCCGTTAC AGGCCGTG3C CCAGAGGGAC 8460 

CCAATAATTG CAAAACAGCT CTTTAGCAGC TTGTTTTC7G GAATTTTGAA AGAGATGGAT 8520 

AAATTTAAGA CACTGTCTGA AAAAAACAAC ATCACTCAAA AGTTGCTTCA AGACTTCAAT 8580 

CGTTTTCTTA ATACCACCTT CTCTTTCTTT CCACCCTTTG TCTCTTGTAT TCAGGACATT 8640 

AGCTGTCAGC ACGCAGCCCT GCTGAGCCTC GACCCAGCGG CTGTTAGCGC TGGTTGCCTG 8700 

GCCAGCCTAC AGCAGCCCGT GGGCATCCGC CTGCTAGAGG AGGCTCTGCT CCGCCTGCTG 8760 

CCTGCTGAGC TGCCTGCCAA GCGAGTCCGT GGGAAGGCCC GCCTCCCTCC TGATGTCCTC 8820 

AGATGGGTGG AGCTTGCTAA G CTGT AT AG A TCAATTGGAG AATACGACGT CCTCCGTGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940

AGAAGTGATT ATTCTGAAGC TGCTAAGCAG TATGATGAGG CTCTCAATAA ACAAGACTGG 9000 

GTAGATGGTG AGCCCACAGA AGCCGAGAAG GATTTTTGGG AACTTGCATC CCTTGACTGT 9060 
TACAACCACC TTGCTGAGTG GAAATCACTT GAATACTGTT CTACAGCCAG TATAGACAGT 9120
GAGAACCCCC CAGACCTAAA TAAAATCTGG AGTGAACCAT TTTATCAGGA AACATA7CTA 9180 
CCTTACATGA TCCGCAGCAA GCTGAAGCTG CTGCTCCAGG GAGAGGCTGA CCAGTCCCTG 9240
CTGACATTTA TTGACAAAGC TATGCACGGG GAGCTCCAGA AGGCGATTCT AGAGCTTCAT 9300 
TACAGTCAAG AGCTGAGTCT GCTTTACCTC CTGCAAGATG ATG7TGACAG AGCCAAATAT 9360 
TACATTCAAA ATGGCATTCA GAGTTTTATG CAGAATTATT CTAGTATTGA TGTCCTCTTA 9420
CACCAAAGTA GACTCACCAA ATTGCAGTCT GTACAGGCTT TAACAGAAAT TCA3GAGTTC 9480 
ATCAGCTTTA TAAGCAAACA AGGCAATTTA TCATCTCAAG TTCCCCTTAA GAGACTTCTG 9540 
AACACCTGGA CAAACAGATA TCCAGATGCT AAAATGGACC CAA7GAACAT CTGGGATGAC 9600 
ATCATCACAA ATCGATGTTT CTTTCTCAGC AAAATAGAGG AGAAGCTTAC CCCTCTTCCA 9660 
GAAGATAATA GTATGAATGT GGATCAAGAT GGAGACCCCA GTGACAGGAT GGAAGTGCAA 9720 
GAGCAGGAAG AAGAT AT CAG CTCCCTGATC AGGAGTTGCA AGTTTTCCAT GAAAATGAAG 9780 
ATGATAGACA GTGCCCGGAA GCAGAACAAT TTCTCACTTG CTATGAAACT ACTGAAGGAG 9840 
CTGCATAAAG AGTCAAAAAC CAGAGACGAT TGGCTGGTGA GCTGGGTGCA GAGCTACTGC 9900 
CGCCTGAGCC ACTGCCGGAG CCGGTCCCAG GGCTGCTCTG AGCAGGTGCT CACTGTGCTG 9960 
AAAACAGTCT CTTTGTTGGA TGAGAACAAC GTGTCAAGCT ACTTAAGCAA AAATATTCTG 10020
GCTTTCCGTG ACCAGAACAT TCTCTTGGGT ACAACTTACA GGA7CATAGC GAATGCTCTC 10080 
AGCAGTGAGC CAGCCTGCCT TGCTGAAATC GAGGAGGACA AGGCTAGAAG AATCTTAGAG 10140 
CTTTCTGGAT CCAGTTCAGA GGATTCAGAG AAGGTGATCG CGGGTCTGTA CCAGAGAGCA 10200
TTCCAGCACC TCTCTGAGGC TGTGCAGGCG GCTGAGGAGG AGGCCCAGCC TCCCTCCTGG 10260 
AGCTGTGGGC CTGCAGCTGG GGTGATTGAT GCTTACATGA CGCTGGCAGA TTTCTGTGAC 10320 
CAACAGCTGC GCAAGGAGGA AGAGAATGCA TCAGTTATTG ATTCTGCAGA ACTGCAGGCG 10380 
TATCCAGCAC TTGTGGTGGA GAAAATGTTG AAAGCTTTAA AATTAAATTC CAATGAAGCC 10440
AGATTGAAGT TTCCTAGATT ACTTCAGATT ATAGAACGGT ATCCAGAGGA GACTTTGAGC 10500 
CTCATGACAA AAGAGATCTC TTCCGTTCCC TGCTGGCAGT TCATCAGCTG GATCAGCCAC 10560 
ATGGTGGCCT TACTGGACAA AGACCAAGCC GTTGCTGTTC AGCACTCTGT GGAAGAAATC 10620 
ACTGATAACT ACCCGCAGGC TATTGTTTAT CCCTTCATCA TAAGCAGCGA AAGCTATTCC 10680 
TTCAAGGATA CTTCTACTGG TCATAAGAAT AAGGAGTTTG TGGCAAGGAT TAAAAGTAAG 10740 
TTGGATCAAG GAGGAGTGAT TCAAGATTTT ATTAATGCCT TAGATCAGCT CTCTAATCCT 10800 
GAACTGCTCT TTAAGGATTG GAGCAATGAT GTAAGAGCTG AACTAGCAAA AACCCCTGTA 10860 
AATAAAAAAA ACATTGAAAA AATGTATGAA AGAATGTATG CAGCCTTGGG TGACCCAAAG 10920 
GCTCCAGGCC TGGGGGCCTT TAGAAGGAAG TTTATTCAGA CTTTTGGAAA AGAATTTGAT 10980
AAACATTTTG GGAAAGGAGG TTCTAAACTA CTGAGAATGA AGCTCAGTGA CTTCAACGAC 11040 
ATTACCAACA TGCTACTTTT AAAAATGAAC AAAGACTCAA AGCCCCCTGG GAATCTGAAA 11100 
GAATGTTCAC CCTGGATGAG CGACTTCAAA GTGGAGTTCC TGAGAAATGA GCTGGAGATT 11160 
CCCGGTCAGT ATGACGGTAG GGGAAAGCCA TTGCCAGAGT ACCACGTGCG AATCGCCGGG 11220 
TTTGATGAGC GGGTGACAGT CATGGCGTCT CTGCGAAGGC CCAAGCGCAT CATCATCCGT 11280 
GGCCATGACG AGAGGGAACA CCCTTTCCTG GTGAAGGGTG GCGAGGACCT GCGGCAGGAC 11340 
CAGCGCGTGG AGCAGCTCTT CCAGGTCATG AATGGGATCC TGGCCCAAGA CTCCGCCTGC 11400 
AGCCAGAGGG CCCTGCAGCT GAGGACCTAT AGCGTTGTGC CCATGACCTC CAGGTTAGGA 11460 
TTAATTGAGT GGCTTGAAAA TACTGTTACC TTGAAGGACC TTCTTTTGAA CACCATGTCC 11520 
CAAGAGGAGA AGGCGGCTTA CCTGAGTGAT CCCAGGGCAC CGCCGTGTGA ATATAAAGAT 11580 
TGGCTGACAA AAATGTCAGG AAAACATGAT GTTGGAGCTT ACATGCTAAT GTATAAGGGC 11640 
GCTAATCGTA CTGAAACAGT CACGTCTTTT AGAAAACGAG AAAGTAAAGT GCCTGCTGAT 11700
CTCTTAAAGC GGGCCTTCGT GAGGATGAGT ACAAGCCCTG AGGCTTTCCT GGCGCTCCGC 11760 
TCCCACTTCG CCAGCTCTCA CGCTCTGATA TGCATCAGCC ACTGGATCCT CGGGATTGGA 11820 
GACAGACATC TGAACAACTT TATGGTGGCC ATGGAGACTG GCGGCGTGAT CGGGATCGAC 11880 
TTTGGGCATG CGTTTGGATC CGCTACACAG TTTCTGCCAG TCCCTGAGTT GATGCCTTTT 11940 
CGGCTAACTC GCCAGTTTAT CAATCTGATG TTACCAATGA AAGAAACGGG CCTTATGTAC 12000
AGCATCATGG TACACGCACT CCGGGCCTTC CGCTCAGACC CTGGCCTGCT CACCAACACC 12060 
ATGGATGTGT TTGTCAAGGA GCCCTCCTTT GATTGGAAAA ATTTTGAACA GAAAATGCTG 12120
AAAAAAGGAG GGTCATGGAT TCAAGAAATA AATGTTGCTG AAAAAAATTG GTACCCCCGA 12180 
CAGAAAATAT GTTACGCTAA GAGAAAGTTA GCAGGTGCCA ATCCAGCAGT CATTACTTGT 12240 
GATGAGCTAC TCCTGGGTCA TGAGAAGGCC CCTGCCTTCA GAGACTATGT GGCTGTGGCA 12300 
CGAGGAAGCA AAGATCACAA CATTCGTGCC CAAGAACCAG AGAGTGGGCT TTCAGAAGAG 12360 
ACTCAAGTGA AGTGCCTGAT GGACCAGGCA ACAGACCCCA ACATCCTTGG CAGAACCTGG 12420 
GAAGGATGGG AGCCCTGGAT GTGAGGTCTG TGGGAGTCTG CAGATAGAAA GCATTACATT 12480
GTTTAAAGAA TCTACTATAC TTTGGTTGGC AGCATTCCAT GAGCTGATTT TCCTGAAACA 12540 
CTAAAGAGAA ATGTCTTTTG TGCTACAGTT TCGTAGCATG AGTTTAAATC AAGATTATGA 12600 
TGAGTAAATG TGTATGGGTT AAATCAAAGA TAAGGTTATA GTAACATCAA AGATTAGGTG 12660
AGGTTTATAG AAAGATAGAT ATCCAGGCTT ACCAAAGTAT TAAGTCAAGA ATATAATATG 12720 
TGATCAGCTT TCAAAGCATT TACAAGTGCT GCAAGTTAGT GAAACAGCTG TCTCCGTAAA 12780 
TGGAGGAAAT GTGGGGAAGC CTTGGAATGC CCTTCTGGTT CTGGCACATT GGAAAGCACA 12840 
CTCAGAAGGC TTCATCACCA AGATTTTGGG AGAGTAAAGC TAAGTATAGT TGATGTAACA 12900
TTGTAGAAGC AGCATAGGAA CAATAAGAAC AATAGGTAAA GCTATAATTA TGGCTTATAT 12960
TTAGAAATGA CTGCATTTGA TATTTTAGGA TATTTTTCTA GGTTTTTTCC TTTCATTTTA 13020
TTCTCTTCTA GTTTTGACAT TTTATGATAG ATTTGCTCTC TAGAAGGAAA CGTCTTTATT 13080 
TAGGAGGGCA AAAATTTTGG TCATAGCATT CACTTTTGCT ATTCCAATCT ACAACTGGAA 13140 
GATACATAAA AGTGCTTTGC ATTGAATTTG GGATAACTTC AAAAATCCCA TGGTTGTTGT 13200 
TAGGGATAGT ACTAAGCATT TCAGTTCCAG GAGAATAAAA GAAATTCCTA TTTGAAATGA 13260 
ATTCCTCATT TGGAGGAAAA AAAGCATGCA TTCTAGCACA ACAAGATGAA ATTATGGAAT 13320 
ACAAAAGTGG CTCCTTCCCA TGTGCAGTCC CTGTCCCCCC CCGCCAGTCC TCCACACCCA 13380 
AACTGTTTCT GATTGGCTTT TAGCTTTTTG TTGTTTTTTT TTTTCCTTCT AACACTTGTA 13440
TTTGGAGGCT CTTCTGTGAT TTTGAGAAGT ATACTCTTGA GTGTTTAATA AAGTTTTTTT 13500 



I I I I I I 

MAGSGAGVRC SLLRLQETLS AADRCGAALA G-ICL T P jL SSSPAV LALQ 

RDFGLLVPVR KSLNSIEFRE CREEILKPLC IFLEKMGQKI APYSVEIKNT CTSVYTKDRA 

AKCKIPALDL LIKLLQTFRS SRLMDEFKIG ELFSKFYGEL ALKKKIPDTV L3KVYELLGL 

LGEVHPSEMI NNAENLFRAF LGELKTQMTS AVREPKLPVL AGCLKGLSSL LCNFTKSMEE 

DPQTSREIFN FVLKAIRPQI DLKRYAVPSA GLRLFALHAS QFSTCLLDMY VSLFEVLLKW 

CAHTNVELKK AALSALESFL KQV SNMVAKN AEMHKNKLQY FMEQFYGIIR NVDS1INKELS 
IAIRGYGLFA GPCKVINAKD VDFMYVELIQ RCKQMPLTQT DTGDDRVYQM PSFLQSVASV 420

TWHQGLIRI CSKPWLPKG PESESEDHRA SGEVR7GKKK VPTYKDYVDL FRHLLSSDQM 540 

MDSILADEAF FSVNSSSESL NHLLYDEFVK SVLKIVEKLD LTLEIQTVGS QENGDEAPGV 600 

WMIPTSDPAA NLHPAKPKDF SAFINLVEFC REILPEKQAE FFEPWVYSFS YELILQSTRL 660

PLISGFYKLL SITVRNAKKI KYFEGVSPKS LKHSPEDPEK YSCFALFVKF GKEVAVKMKQ 720 

YKDELLASCL TFLLSLPHNI IELDVRAYVP ALQMAFKLGL SYTPLAEVGL NALEEWSIYI 780 

DRHVMQPYYK DILPCLDGYL KTSALSDETK NNWEVSALSR AAQKGFNKW LKHLKKTKNL 840 

SSNEAISLEE IRIRWQMLG SLGGQINKNL LTVTSSDBKM KSYVAWDREK RLSFAVPFRE 900 

MKPVIFLDVF LPRVTELALT ASDRQTKVAA CELLHSMVKF MLGKATQMPS GGQGAPPMYQ 960

LYKRTFPVLL RLACDVDQVT RQLYEPLVMQ LIHWFTNNKK FESQDTVALL EAILDGIVDP 1020 

VDSTLRDFCG RCIREFLKWS IKQITPQQQE KSPVNTKSLF KRLYSLALH? NAFKRLGASL 1080 

AFNNIYREFR EEESLVEQFV FEALVIYMES LALAHADEKS LGTIQQCCDA IDHLCRIIEK 1140 

KHVSLNKAKK RRLPRGFPPS ASLCLLDLVK WLLAHCGRPQ TECRHKSIEL FYKFVPLLPG 1200 

NRSPNLWLKD VLKEEGVSFL INTFEGGGCG QPSGILAQPT LLYLRGPFSL QATLCWLDLL 1260

LAALECYNTF IGERTVGALQ VLGTEAQSSL LKAVAFFLES IAMHDIIAAE KCFGTGAAGN 1320 

RTSPQEGERY NYSKCTWVR IMEFTTTLLN TSPEGWKLLK KDLCNTKLMR VLVQTLCEPA 1380 

SIGFNIGDVQ VMAHLPDVCV NLMKALKMSP YKDILETHLR EKITAQSIEE LCAVNLYGPD 1440 

AQVDRSRLAA WSACKQLHR AGLLHNILPS QSTDLHHSVG TELLSLVYKG IAPGDERQCL 1500 

PSLDLSCKQL ASGLLELAFA FGGLCERLVS LLLNPAVLST ASLGSSQGSV IHFEHGEYFY 1560

SLFSETINTE LLKNLDLAVL ELMQSSVDNT KMVSAVLNGM LDQSFRERAN QKHQGLKLAT 1620 

TILQHWKKCD SWWAKDSPLE TKMAVLALLA KILQIDSSVS FNTSHGSFP3 VFTTYISLLA 1680 

DTKLDLHLKG QAVTLLPFFT SLTGGSLEEL RRVLEQLIVA HFPMQSREFP PGTPRFNNYV 1740 

DCMKKFLDAL ELSQSPMLLE LMTEVLCREQ QHVMEELFQS SFRRIARRGS CVTQVGLLES 1800 

VYEMFRKDDP RLSFTRQSFV DRSLLTLLWH CSLDALREFF STIWDAIDV LKSRFTKLNE 1860

STFDTQITKK MGYYKILDVM YSRLPKDDVH AKESKINQVF HGSCITEGNE LTKTLIKLCY 1920 

DAFTENMAGE NQLLERRRLY HCAAYNCAIS VICCVFKELK FYQGFLFSEK PEKNLLIFEN 1980 

LIDLKRRYNF PVEVEVPMER KKKYIEIRKE AREAANGDSD GPSYMSSLSY LADSTLSEEM 2040 

SQFDFSTGVQ SYSYSSQDPR PATGRFRRRE QRDPTVHDDV LELEMDELNR HECKAPLTAL 2100 

VKHMHRSLGP PQGEEDSVPR DLPSWMKFLH GKLGNPIVPL NIRLFLAKLV INTEEVFRPY 2160

AKHWLSPLLQ LAASENNGGE GIHYMWEIV ATILSWTGLA TPTGVPKDEV LANRLLNFLM 2220 

KHVFHPKRAV FRHNLEIIKT LVECWKDCLS IPYRLIFEKF SGKDPNSKDM SVGIQLLGIV 2280 

MANDLPPYDP QCGIQSSEYF QALVNNMSFV RYKEVYAAAA EVLGLILRYV MERKNILEES 2340 

LCELVAKQLK QHQNTMEDKF IVCLNKVTKS FPPLADRFMN AVFFLLPKFH GVLKTLCLEV 2400 

VLCRVEGMTE LYFQLKSKDF VOVMRHRDDE RQKVCLDIIY KMMPKLKPVE LRELLNPWE 2460

FVSHPSTTCR EQMYNILMWI HDNYRDPESE TDNDSQEIFK LAKDVLIQGL IDENPGLQLI 2520 

IRNFWSHETR LPSNTLDRLL ALNSLYSPKI EVHFLSLATN FLLEMTSMSP DYPNPMFEHP 2580 

LSECEFQEYT IDSDWRFRST VLTPMFVETQ ASQGTLQTRT QEGSLSARKP VAGQIRATQQ 2640 

QHDFTLTQTA DGRSSFDHLT GSSTDPLVDH TSPSSDSLLF AHKRSERLQR APLKSVGPDF 2700 

GKKRLGLPGD EVDNKVKGAA GRTDLLRLRR RFMRDQEKLS LMYARKGVAE QKREKEIKSE 2760

LKMKQDAQW LYRSYRHGDL PDIQIKHSSL ITPLQAVAQR DPIIAKQLFS SLFEGILKEM 2820 

DKFKTLSEKN NITQKLLQDF NRFLNTTFSF FPPFVSCIQD ISCQHAALLS LDPAAVSAGC 2880

LASLQQPVGI RLLEEALLRL LiPAELPAKRV RGKARLPPDV LRWVELAKLY RS IGEYDVLR 2940

GIFTSEIGTK QITQSALLAE ARSDYSEAAK QYDEALNKOD WVDGEPTEAE KDFKELAELD 3000 

CYNHLAEWKS LEYCSTASID SENPPDLNKI WSEPFYQETY LPYMIREKLK LLLQGEADQS 3060

LLTFIDKAMH GELQKAI LEL HYSQELSLLY LLQDDVDRAK YYIQNGIQSF MQNYSSIDVL 3120 

LHQSRLTKLQ SVQALTEIQE FISFISKQGN LSSQVPLKRL LNTWTNRYPD AKMEPMNIWD 3180 

DIITNRCFFL SKIEEKLTPL PEDMSMN\'DQ DGDPSDRMEV QEQEEDISSL IRSCKFSMKM 3240 

KMIDSARKQN NFSLAMKLLK ELHKESKTRD DWLVSWVQEY CRLSHCRSRS QGCEEQVLTV 3300 

LKTVSLLDEM NVSSYLSKNI LAFRDQNI LL GTTYRIIANA LESEPACLAE IEECKARRIL 3360

ELSGSSSEDS EKVIAGLYQR AFQHLSEAVQ AAEEEAQPPS WECGPAAGVI DAYMTLADFC 3420 

DQQLRKEEEN ASVIDSAELQ AYPALWEKM LKALKLNSNE ARLKFPRLLQ IIERYPEETL 3480 

SLMTKEISSV PCWQFISWIS HMVALLDKDQ AVAVQHSVEE ITDNYPQAIV YPFIISSESY 3540 

SFKDTSTGHK NKEFVARIKS KLDQGGVIQD FINALDQLSN PELLFKDWSN DVRAELAKTP 3600 

VNKKNIEKMY ERMYAALGDP KAPGLGAFRR KFIQTFGKEF DKKFGKGGSK LLRMKLSDFN 3660

DITMMLLLKM NKDSKPPGNL KECSPWMSDF KVEFLRNELE IPGQYDGRGK PLPEYHVRIA 3720

GFDERVTVMA SLRRPKRIII RGHDEREHPF LVKGGEDLRQ DQRVEQLFQV MNGILAQDSA 3780 

CSQRALQLRT YSWPMTSRL GLIEWLEKTV TLKDLLLNTM SQEEKAAYLS DPRAPPCEYK 3840

DWLTKMSGKH DVGAYMLMYK GANRTETVTS FRKRESKVPA DLLKRAFVRM STSPEAFLAL 3900

RSHFASSHAL ICISHWILGI GDRHLNNFMV AMETGGVIGI DFGHAFGSAT QFLPVPELMP 3960

FRLTRQFINL MLPMKETGLM YSIMVHAI/RA FRSDPGLLTN TMDVFVKEPS FDWKNFEQKM 4020 

LKKGGSWIQE INVAEKNWYP RQKICYAKRK LAGANPAVIT CDELLLGHEK APAFRDYVAV 4080 

ARGSKDHNIR AQEPESGLSE ETQVKCLMDQ ATDPNI LGRT WEGWEPWM 

Seq ID NO: 100 DNA sequence
Coding sequence: 101-1225 

I I I I I I

ATGTGAAGGC ACAAGCTGCT GTTATATACA ACAGAGTGAA CTGAGCATCA GTCAGAAAAA 60 

GTCTATGTTT GCAGAAATAC AGATCCAAGA CAAAGACAGG ATGGGCACTG CTGGAAAAGT 120 

TATTAAATGC AAAGCAGCTG TGCTTTGGGA GCAGAAGCAA CCCTTCTCCA TTGAGGAAAT 180 

AGAAGTTGCC CCACCAAAGA CTAAAGAAGT TCGCATTAAG ATTTTGGCCA CAGGAATCTG 240 

TCGCACAGAT GACCATGTGA TAAAAGGAAC AATGGTGTCC AAGTTTCCAG TGATTGTGGG 300

ACATGAGGCA ACTGGGATTG TAGAGAGCAT TGGAGAAGGA GTGACTACAG TG.AAACCAGG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAGAGAA TGCAATGCTT GTCGCAACCC 420 

AGATGGCAAC CTTTGCATTA GGAGCGATAT TACTGGTCGT GGAGTACTGG CTGATGGCAC 480 

CACCAGATTT ACATGCAAGG GCAAACCAGT ACACCACTTC ATGAACACCA GTACATTTAC 540 

CGAGTACACA GTGGTGGATG AATCTTCTGT TGCTAAGATT GATGATGCAG CTCCTCCTGA 600

GAAAGTCTGT TTAATTGGCT GTGGGTTTTC CACTGGATAT GGGGCTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGCGTCGT CTTTGGCCTG GGAGGAG7TG GCCTGTCAGT 720 

CATCATGGGC TGTAAGTCAG CTGGTGCATC TAGGATCAT7 GGGATTGACC TCAACAAAGA 780 

CAAATTTGAG AAGGCCATGG CTGTAGGTGC CACTGAGTGT ATCAGTCCCA AGGACTCTAC 840 

CAAACCCATC AGTGAGGTGC TGTCAGAAAT GACAGGCAAC AACGTGGGAT ACACCTTTGA 900

AGTTATTGGG CATCTTGAAA CCATGATTGA TGCCCTGGCA TCCTGCCACA TGAACTATGG 960 

GACCAGCGTG GTTGTAGGAG TTCCTCCATC AGCCAAGATG CTCACCTATG ACCCGATGTT 1020
GCTCTTCACT GGACGCACAT GGAAGGGATG TGTCTTTGGA GGTTTGAAAA GCAGAGATGA 1080
TGTCCCAAAA CTAGTGACTG AGTTCCTGGC AAAGAAATTT GACCTGGACC AGT7GATAAC 1140 
TCATGTTTTA CCATTTAAAA AAATCAGTGA AGGATTTGAG CTGCTCAATT CAGGACAAAG 1200 
CATTCGAACG GTCCTGACGT TTTGAGATCC AAAGTGGCAG GAGGTCTGTG TTG7CATGGT 1260 
GAACTGGAGT TTCTCTIGTG AGAGTTCCCT CATCTGAAAT CATGTATCTG TCTCACAAAT 1320 
ACAAGCATAA GTAGAAGATT TGTTGAAGAC ATAGAACCCT TATAAAGAAT TATTAACCTT 1380 
TATAAACATT TAAAGTCTTG TGAGCACCTG GGAATTAGTA TAATAACAAT GTTAATATTT 1440 
TTGATTTACA TTTTGTAAGG CTATAATTGT ATCTTTTAAG AAAACATACA CTTGGATTTC 1500 
TATGTTGAAA TGGAGATTTT TAAGAGTTTT AACCAGCTGC TGCAGATATA TAACTCAAAA 1560 
CAGATATAGC GTATAAAGAT ATAGTAAATG CATCTCCCAG AGTAATATTC ACTTAACACA 1620 
TTGAAACTAT TATTTTTTAG ATTTGAATAT AAATGTATTT T7TAAACACT TGTTATGAGT 1680 
TAACTTGGAT TACATTTTGA AATCAGTTCA TTCCATGATG CATATTACTG GATTAGATTA 1740 
AGAAAGACAG AAAAGATTAA GGGACGGGCA CATTTTTCAA CGATTAAGAA TCATCATTAC 1800 
ATAACTTGGT GAAACTGAAA AAGTATATCA TATGGGTACA CAAGGCTATT TGCCAGCATA 1860 
T ATAAACATAG AGCTAGAGTC """" 



ACTTATCATA ATGTTCAATT TGATACAGTA GAATTGCAAG TCCCTAAGTC 1980 

CCTATTCACT GTGCTTAGTA GTGACTCCAT TTAATAAAAA GTGTTTTTAG TTTTTAACAA 2040 
CTAAACCG 



1 11 21 31 41 51 

I I I I I 

MGTAGKVIKC KAAVLWEQKQ PPSIEEIEVA PPKTKEVRIK ILATGICRTD DKVIKGTMVS 
KFPVIVGHEA TGIVESIGEG VTTVKPGDKV IPLPLPQCRE CNACRNPDGN LCIRSDITGR 
GVLADGTTRF TCKGKPVHHF MNTSTFTEYT WDESSVAKI DDAAPP3KVC LIGCGFSTGY 
GAAVKTGKVK PGSTCWFGL GGVGLSVIMG CKSAGASRII GIDLNKDKFE KAMAVGAT3C 
ISPKDSTKPI SEVLSEMTGK NVGYTFEVIG HLETMIDALA SCHMNYGTSV WGVPPSAKM 
LTYDPMLLFT GRTWKGCVFG GLKSRDDVPK LVTEFLAKKF DLDOLITHVL PFKKISEGPE 
LLNSGQSIRT VLTF
ATGGATTGGG GGACGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC
GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 
CAGGAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGGGATGC 
AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCGGCTGTG GGCCCTCCAG 
TGATI ~rrG TCTCCACCCC AGCGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 
GAAACCACTC GCAAGTTCAG GCGAGGAGAG AAGAGGAATG ATTTCAAAGA CATAGAGGAC 
ATTAAAAAGC ACAAGGTTCG GATAGAGGGG TCGCTGTGGT GGACGTACAC CAGCAGCATC 
TTTTTCCGAA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 
TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 
TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC TGCGTCTGTG 
ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 
AGATCAAAGA GAGCACAGAC GCAAAAAAAT CACCCCAATC ATGCCCTAAA GGAGAGTAAG 
CAGAATGAAA TGAATGAGCT GATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 



1 11 21 31 41 51 

I I I I I 

MDWGTLHTFI GGVNKHSTSI GKVWITVIFI FRVMILWAA QEVWGDEQED FVCNTLQPGC 60 

KNVCYDHFFP VSHIRLWALQ LIFVSTPALL VAMHVAYYRH ETTRKFRRGE KRNDFKDIED 120 

IKKHKVRIEG SLWWTYTSSI FFRIIFEAAF MYVFYFLYNG YHLPWVLKCG IDPCPNLVDC 180 

FISRPTEKTV FTIFMISASV ICMLLNVAEL CYLLLKVCFR RSKRAQTQKN KPNHALKESK 240 
QNEMNELISD SGQNAITGFP S 

Seq ID NO: 104 DNA sequence 
Nucleic Acid Accession #: NM_020411 
Coding sequence: 86-526 

I I I I I I

GGACCTGGGA AGGAGCATAG GACAGGGCAA GGCGGGATAA GGAGGGGCAC CACAGCCCTT 60 

AAGGCACGAG GGAACCTCAC TGCGCATGCT CCTTTGGTGC CCACCTCAGT GCGCATGTTC 120 

ACTGGGCGTC TTCCCATCGG CCCCTTCGCC AGTGTGGGGA ACGCG3CGGA GCTGTGAGCC 180 

GGCGACTCGG GTCCCTGAGG TCTGGATTCT TTCTCCGCTA CTGAGACACG GCGC-ACACAC 240 

ACAAACACAG AACCACACAG CCAGTCCCAG GAGCCCAGTA ATGGAGAGCC CCAAAAAGAA 300 

GAACCAGCAG CTGAAAGTCG GGATCCTACA CCTGGGCAGC AGACAGAAGA AGATCAGGAT 360

ACAGCTGAGA TCCCAGTGCG CGACATGGAA GGTGATCTGC AAGAGCTGCA TCAGTCAAAC 420 

ACCGGGGATA AATCTGGATT TGGGTTCCGG CGTCAAGGTG AAGATAA7AC CTAAAGAGGA 480 

ACACTGTAAA ATGCCAGAAG CAGGTGAAGA GCAACCACAA GTTTAAATGA AGACAAGCTG 540 

AAACAACGCA AGCTGGTTTT ATATTAGATA TTTGACTTAA ACTATCTCAA TAAAGTTTTG 600
CAGCTTTCAC G
MLLWCPPQCA CSLGVFPSAP SPVWGTRRSC EPATRVPBVW ILSPLLRHGG IITQTQNHTAS 60
PRSPVMESPK KKNQQLKVGI LHLGSRQKKI RIQLRSQCAT WKVICKSCIS QTPGINLDLG 120 
SGVKVKIIPK EEHCKMPEAG EEQPQV 



CCACAGCCGC AGCCATGCTG TGCCTCCTGC 



Nucleic Acid Accession #: J
Coding sequence: 99-587



AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCAT3 GCGACCAACA 
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTTGC 
CCACCCCCGA GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCIGTGTTG 
AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 
TGGCGAACGA GGCCACGCIG CTCGATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 
AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG 
AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 
GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 
CCAGGAAGAC CAGACTCCCA CCCTTCCACA CCTGCAGAGC AGTGGGACTT CCTCCTGCCC 
TTTCAAAGAA TAACCACAGC TCAGAAGACG ATGACGTGGT CATCTGTGTC GCCATCCCCT 
TCCTGCTGCA CACCTGCACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G 




I I I I I I

MDIPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLEIVLHR 
WENNSCVEKK VLGEKTGNPK KFKINYTVAN EATLLDTDYD NFLFLCLQDT TTPIQSMMCQ 
YLARVLVEDD EIMQGFIRAF RPLPRHLWYL LDLKQMEEPC RF



Seq ID NO: 108 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 48-794

1 11 21 31 41 51 

I I I I I I

TCCCAGGCAG CAGTTAGCCC GCCGCCCGCC TGTGTGTCCC CAGAGCCATG GAGAGAGCCA 
GTCTGATCCA GAAGGCCAAG CTGGCAGAGC AGGCCGAACG CTATGAGGAC ATGGCAGCCT 
TCATGAAAGG CGCCGTGGAG AACGGCGAGG AGCTCTCCTG CGAAGAGCGA AACCTGCTCT 
CAGTAGCCTA TAAGAACGTG GTGGGCGGCC AGAGGGCTGC CTGGAGGGTG CTGTCCAGTA 
'i GAGCAGAA AAGCAACGAG GAGGGCTCGG AGGAGAAGGG GCCCGAGGTG CGTGAGTACC 
GGCAGAACGT GGAGACTGAG CTCCAGGGCG TGTGCGACAC CGTGCTC 
GCCACCTCAT CAAGGAGGCC GGGGACGCCG AGAGCCGGGT CTTCTACCTG A 
GTGACTACTA CCGCTACCTG GCCGAGGTGG CCACCGGTGA CGACAAGAAG CGCATCATTG 480
ACTCAGCCCG GTCAGCCTAC CAGGAGGCCA TGGACATCAG CAAGAAGGAG ATGCCGCCCA 540
CCAACCCCAT CCGCCTGGGC CTGGCCCTGA ACTTTTCCGT CTTCCACTAC GAGATCGCCA 600 
ACAGCCCCGA GGAGGCCATC TCTCTGGCCA AGACCACTTT CGACGAGGCC ATGGCTGATC 660 
TGCACACCCT CAGCGAGGAC TCCTACAAAG ACAGCACCCT CATCATGCAG CTGCTGCGAG 720 
ACAACCTGAC ACTGTGGACG GCCGACAACG CCGGGGAAGA GGGGGGCGAG GCTCCCCAGG 780 
AGCCCCAGAG CTGAGTGTTG CCCGCCACCG CCCCGCCCTG CCCCCTCCAG TCCCCCACCC 840 
TGCCGAGAGG ACTAGTATGG GGTGGGAGGC CCCACCCTTC TCCCCTAGGC GCTGTTCTTG 900 
CTCCAAAGGG CTCCGTGGAG AGGGACTGGC AGAGCTGAGG CCACCTGGGG CTGGGGATCC 960 
CACTCTTCTT GCAGCTGTTG AGCGCACCTA ACCACTGGTC ATGCCCCCAC CCCTGCTCTC 1020 
CGCACCCGCT TCCTCCCGAC CCCAGGACCA GGCTACITCT CCCCTCCTCT TGCCTCCCTC 1080 
CTGCCCCTGC TGCCTCTGAT CGTAGGAATT GAGGAGTGTC CCGCCTTGIG GCTGAGAACT 1140 
GGACAGTGGC AGGGGCTGGA GATGGGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTG1GTG 1200 
CGCGCGCGCC AGTGCAAGAC CGAGATTGAG GGAAAGCATG TCTGCTGGGT GTGACCATGT 1260 
TTCCTCTCAA TAAAGTTCCC CTGTGACACT C 

Seq ID NO: 109 Protein sequence:
Protein Accession #: NP_006133.1

1 11 21 31 41 51 

I I I I I I 

MERASLIQKA KLAEQAERYE DMAAFMKGAV EKGEELSCEE RNLLEVAYKN WGGQRAAWR 60 
VLSSIEQKSN EEGSEEKGPE VREYREKVET ELQGVCDTVL GLLDSHLIKE AGDAESRVFY 120 
LKMKGDYYRY LAEVATGDDK KRIIDSARSA YQEAMDISKK EMPPTNPIRL GLALNFSVFH 180
YEIANSPEEA ISLAKTTFDE AMADLHTLSE DSYKDSTLIM QLL3DNLTLW TADNAGEEGG 240 
EAPQEPQS 

Seq ID NO: 110 DNA sequence 
Nucleic Acid Accession # : NM_000695 
Coding sequence: 407-1564 

1 11 21 31 41 51 

I I I I I I 

CACGAGTTGG TTTGGGAGCT GCCAGTCTCC TGG - ATC . ^GTCAGCA GAGCAGGGCT 60 

GAGGCCTGGG GGTAGGAGCA GAGCCTGCGC ATCTGGAGGC AGCATGTCCA AGAAAGGGAG 120 

TGGAGGTGCA GCGAAGGACC CAGGGGCAGA GCCCACGCTG GGGATGGACC CCT7CGAGGA 180 

CACACTGCGG CGGCTGCGTG AGGCCTTCAA CTGAGGGCGC ACGCGGCCGG CCGAGTTCCG 240 

GGCTGCGCAG CTCCAGGGCC TGGGCCACTT CCTTCAAGAA AACAAGCAGC TTCTGCGCGA 300 







CGTGCTGGCC CAGGACCTGC 
TTGCCAGAAC GAGGTTGACT 
ACGGTCCACG AACCTGTTCA 
CCTGGTCCTC ATCATCGCAC 
GGGCACCCTC CCCGCAGGGA 
AGAGAAGGTC CTGGCTGAGG 
GCTGGGCGGA CCCCAGGAGA 
C CCTCGTGTGG 



ATAAGCCAGC TTTCGAGGCA GACATATCTG AGCTCATCCT 
ACGCTCTCAA GAACCTTCAG GCCTGGATGA AGGATGAACC 
TGAAGCTGGA CTCGGTCTTC ATCTGGAAGG AACCCTTTGG 
CCTGGAACTA CCCATTGAAC CTGACCCTGG TGCTCCTGGT 
GCTGAAGCCG TCAGAAATCA GCCAGGGCAC 



CAGGGCAGCT GCTAGAGCAC AAG7TGGACT ACATCT7CTT 
GCAAGATTGT CATGACTGCT GCCACCAAGC ACCTGACGCC 
GCGACCCCCA 

CTACTTCAAT GCCGGCCAGA CCTGCGTGGC 
GCCCCGAGAT GCAGGAGAGG CTGCTGCCCG CCCTGCAGAG 






GAGCGTGGAC 
TGGAGGCAAT 
CACCTGCCTG 



CAGTTCCAGC 
AACGAGAGCG 

GTGATGCAGG 



GGCTGCGGGC 



AGCAGACAGG T 



ATGGGCCGGT 



GCTTATGCTC 
TGGAGCTGTC 
TCTGGGGGAC 



AACCAGCAGC 
CACCCGCCTC 
CCAACTCACA 
ACATGACTGC 



CACCCCACCC 
CACAGGGGCA 
GAACGGTTGA 
TTCCACCTCT 
CCCACACTGG 
AGCTCCATCC 
CTGGGGGCAA 
CCAAAATGGA 



TCCCCAATTC 
GTGTCACCCT 
GAGCGTGGAG 
GCCCCATCCC 
TCTCTGCACC 



AGACACAGGG CGTATGGAAA 
GATGCTTACC TACCACGGCC 
TGTGACTTAC AAACCTTGTT 
CCCTTGGCTG TGGCCCTCTG 
GGAATCCTCT GCTCCTCCCA 



Seq ID NO: 111 Protein sequence



ACCACGGCAA 
CCGGCCTGGA 
TGTTACGCTG 
CAACGGGTCA 

ATCCTGCCTG 
AGAGGCCGAG 
CAGCCCTTTG 
GGAAAATACA 
CCCTCCAGGC 
AACTGCACCA 
ACCCCTCTGG 
CTGGGGTTTG 
TTCTCTGAGC 
CCAAACTCTA 
TAACAGGATT 
AGCACGTCCT 
GTCTCCACCA 
TAAAAGCTGC 
TGTATGCCTG 
AATAAATTCA 



GATGCTGGAG 
TCTGCTGTCC 
GTTCACCTTC 



TGCGGCCGCG TGGCCATTGG 
GTGCTGGTGG ACGTGCAGGA 
CTGCCCATCG TGAACGTGCA 
CCCTGTACGC 



GTGCCATTCG GGGGAGTCGG 



CCAGGGCTGC 



GAGATCCGCT 
TCCCAGAGCT 
CCTGAGTCTA 
CTCCCCCAGC 
AAAGCAAGGT 
ACATGCCAGG 



CCCTCTCGGT 
GTGCCCTGCC 
CTTTGCTCTC 
GCACTGCCTC 
TTCACACCGC 
CATCACTCCA 
CTCAGTTTCC 
ATAAAATGGA 
TATCACCAAG ACACGCCTGC 



CAGGCCCAGT 
CATCAGCCCT 
ACACGCGCAC 



ACCCTGCACT 
CTGCACAGTG 
TTATGIGAAA 



CAAAGACTGT 
GAAAACCATC 
TTACATGGAC 
GGATCCTTCC 
TCTGTTC 



GCCAACTCCT 
TTCTGTCCTT 

AAGCACTCAT 



CACCCACAGC 
TTAGTGGGAC 
GTTGCTGGAA 
CACATAGAAG 
ATGTAAGACC 
ATGAGCTGCA 
GCGATCAGCT 
TAAAACGTTC 



1320 
1380 

1500 
1560 
1620 
1680 
1740 

1860
1920 
1980 
2040 
2100 
2160 

2280 
2340 

2460 
2520 
2580 



I I I 

I APWNYPLNLT LVLLVGTLPA GNCWLKPSE 

ISQGTEKVLA EVLPQYLDQS CFAWLGGPQ ETGQLLEKKL DYIFFTGSPR VGKIVMTAAT 

KHLTPVTLEL GGKNPCYVDD NCDPQTVANR VAWFCYFNAG QTCVAPDYVL CSPEMQERLL 

PALQSTITRF YGDDPQSSPN LGRIINQKQF QRLRALLGCG RVAIGGQSNE SDRYIAPTVL 

VDVQETErVM QEEIFGPILP IVNVQSVDEA IKFINRQEKP LALYAFSNSR QWNQMLERT 

SSGSFGGNEG FTYISLLSVP FGGVGHSGMG RYHGKFTFDT F " 
RYPPYTDWKQ QLLRWGMGSQ SCTLL 

Seq ID NO: 112 DNA sequence 
Nucleic Acid Accession #: NM_004456
Coding sequence: 58-2298 



GAATTCCGGG CGACGCGCGG GAACAACGCG AGTCGGCGCG CGGGACGAAG 
TGAGAAGGGA 
GCTCAAGAGG 
AATTTTGGAA 



GGCCAGACTG GGAAGAAATC T 



TTTAGTTCCA A 



GAGTGTTCGG 
AATGCAGTTG 
GTGGAAGATG 
GATGGTACTT 
GAATGTGGGT 
AATGATGATG 



AAACTGTTTT 
TCATTGAAGA 
TTATAAATGA 
ACGATGATGA 



CTTGGATTTT 
CATAATGTAT 
ACATAACATT 



CCAGTTTGTT 
TTCAGACGAG 
AGAACGGAAA 
ACTTCTGTGA 

TCTTGGTCTC 
CCTTATATGG 
AATTATGATG 
GTGGAGTTGG 
GATCCTGAAG 
AGCCGCCCAC 



AATAATCATG 
TGTAAAATCA 
AAAGAGTATG 
AGAATGGAAA 
CGGGACTAGG 



h GAATTTTATG 



GGAAAGTACA CGGGGATAGA 540
TGAATGCCCT TGGTCAATAT 600



AAGAACTCAC CGAACAGCAG CTCCCAGGCG CACTTCCTCC 
ATGGACCAAA TGCTAAATCT GTTCAGAGAG AGCAAAGCTT 
CATACGCTTT TCTGTAGGCG ATGTTTTAAA TATGACTGCT TCCTACATCC 
GAAGAACACA GAAACAGCTC 
GGAGGGAGCA AAGGAGTTTG 
CCCCACCAAA ACGTCCAGGA GGCCGCAGAA 
AGTAGCAGGC CCAGCACCCC CACCATTAAT GTGCTGGAAT CAAAGGATAC 
AGGGAAGCAG GGACTGAAAC GGGGGGAGAG AACAATGA' 
GATGAAACTT CGAGCTCCTC TGAAGCAAAT TCTCGGTG' 
CCAAATATTG AACCTCCTGA GAATGTGGAG TGGAGTGGTG CTGAAGCCTC 
GCACTTACTA TGACAATTTC TGTGCCATTG CTAGGTTAAT 
AGGTGTATGA GTTTAGAGTC AAAGAATCTA 
GCTGAGGATG TGGATACTCC TCCAAGGAAA AAGAAGAGGA 
CACTGCAGAA AGATACAGCT GAAAAAGGAC GGCTCCTCTA ACCATGTTTJ 



AAAGCAGAAA 
7CCTTCTGA7 
AGAACTAAAG 
TGAATGTACC 
ACACTCCTTT 



TGGGACCAAA 
"CCAGCTCCC 

CAACTATCAA
CCCTGTGATC ATCCACGGCA GCCTTGTGAC AGTTCGTGCC CTTGTGTGAT AGCACAAAA? 1680

TTTTGTGAAA AGTTTTGTCA ATGTAGTTCA GAGTGTCAAA ACCGCTTTCC GGGATGCCGC 1740

TGCAAAGCAC AGTGCAACAC CAAGCAGTGC CCGTGCTACC TGGCTGTCCG AGAGTGTGAC 1800

CCTGACCTCT GTCTTACTTG TGGAGCCGCT GACCATTGGG ACAGTAAAAA TGTGTCCTGC 1860

AAGAACTGCA GTATTCAGCG GGGCTCCAAA AAGCATCTAT TGCTGGCACC ATCTGACGTG 1920

GCAGGCTGGG GGATTTTTAT CAAAGATCCT GTGCAGAAAA ATGAATTCAT CTCAGAATAC 1980

TGTGGAGAGA TTATTTCTCA AGATGAAGCT GACAGAAGAG GGAAAGTGTA TGATAAATAC 2040

ATGTGCAGCT TTCTGTTCAA CTTGAACAAT GATTTTGTGG TGGATGCAAC CCGCAAGGGT 2100

AACAAAATTC GTTTTGCAAA TCATTCGGTA AATCCAAACT GCTATGCAAA AGTTATGATG 2160

GTTAACGGTG ATCACAGGAT AGGTATTTTT GCCAAGAGAG CCATCCAGAC TGGCGAAGAG 2220

CTGTTTGTTG ATTACAGATA CAGCCAGGCT GATGCCCTGA AGTATGTCGG CATCGAAAGA 2280

GAAATGGAAA TCCCTTGACA TCTGCTACCT CCTCCCCCTC CTCTGAAACA GCTGCCTTAG 2340

CTTCAGGAAC CTCGAGTACT GTGGGCAATT TAGAAAAAGA ACATGCAGTT TGAAATTCTG 2400

AATTTGCAAA GTACTGTAAG AATAATTTAT AGTAATGAGT TTAAAAATCA ACTTTTTATT 2460

GCCTTCTCAC CAGCTGCAAA GTGTTTTGTA CCAGTGAATT TTTGCAATAA TGCAGTATGG 2520
TACATTTTTC AACTTTGAAT AAAGAATACT TGAACTTGAA A 



1 11 21 31 41 51 

I I I I I I 

MGQTGKKSEK GPVCWRKRVK SEYMRLRQLK RPRRADEVKS MFSSNRQKIL ERTEILNQEW 
KQRRIQPVHI LTSVSSLRGT RECSVTSDLD FPTQVIPLKT LNAVASVPIK YSWSPLQQNF 
MVEDETVLHN IPYMGDEVLD QDGTFIEELI KNYDGKVKGD RECGFIMDSI FVELVNALGQ 
YNDDDDDDDG DDPEEREEKQ KDLEDHRDDK ESRPPRKFPS DKILEAISSM FPDKGTAEEL 
KEKYKELTEQ QLPGALPPEC TPNIDGPNAK SVQREQSLHS FHTLFCRRCF KYDCFLHPFH 
ATPNTYKRKN TETALDNKPC GPQCYOHLEG AKEFAAALTA ERIKTPPKRP GGRRRGRLPN 
NSSRPSTPTI NVLESKDTDS DREAGTETGG ENNDKEEEEK KDETSSSSSA NSRCQTPIKM 
KPNIEPPENV EWSGAEASMF RVLIGTYYDN FCAIARLIGT KTCRQVYEFR VKESSIIAPA 
PAEDVDTPPR KKKRKHRLKA AHCRKIQLKK DGSSMHVYNY QPCDHPRQPC DSSCPCVIAQ 
NFCEKFCQCS SECQNRFPGC RCKAQCNTKQ CPCYLAVREC DPDLCLTCGA ADHKDSKNVS 
CKNCSIQRGS KKHLLLAPSD VAGWGIFIKD PVQKNEFISE YCGEIISQDE ADRRGKVYDK 

YMCSFLFNLN NDFWDATRK GNKIRFANHS VNPNCYAKVM MVNGDHRIGI F 

ELFVDYRYSQ ADALKYVGIE REMEIP 

Seq ID NO: 114 



1 11 21 31 41 51 

I I I I I I

AG-'CTCCGGC GAGTTGTTGC CTGGGCTGGA CGTGGTTTTG TCTGCTGCGC CCGCTCTTCG 

CGCTCTCGTT TCATTTTCTG CAGCGCGCCA CGAGGATGGC CCACAAGCAG ATCTACTACT 

CGGACAAGTA CTTCGACGAA CACTACGAGT ACCGGCATGT TATGTTACCC AGAGAACTTT 

CCAAACAAGT ACCTAAAACT CATCTGATGT CTGAAGAGGA GTGGAGGAGA CTTGGTGTCC 

AACACAGTCT AGGCTGGGTT CATTACATGA TTCATGAGCC AGAACCACAT ATTCTTCTCT 

TTAGACGACC TCTTCCAAAA GATCAACAAA AATGAAGTTT ATCTGGGGAT CGTCAAATCT 

TTTTCAAATT TAATGTATAT GTGTATATAA GGTAGTATTC AGTGAATACT TGAGAAATGT 

ACAAATCTTT CATCCATACC TGTGCATGAG CTGTATTCTT CACAGCAACA GAGCTCAGTT 

AAATGCAACT GCAAGTAGGT TACTGTAAGA TGTTTAAGAT AAAAGTTCTT CCAGTCAGTT 

TTTCTCTTAA GTGCCTGTTT GAGTTTACTC AAACAGTTTA CT7TTGTTCA ATAAAGTTTG 
TATGTTGCAT TTAAAAAAAA AAAAAAA 

Seq ID NO: 115 Protein sequence: 
Protein Accession #: NP_001818 



MAHKQIYYSD KYFDEHYEYR HVMLPRELSK QVPKTHLMSE EEWRRLGVQQ SLGWVHYMIH 
EPEPHILLFR RPLPKDQQK 



I I I I I I 

r GGACTCTTGA GCCACCTCTG GGGGTGGAGT CTC 
r ATCGACGAAG CTTGGGTGGG GCTCTTAGCT GCTATGTGCA 
AGAGGTGTGT TCCAGGGAAA GCCCCTATCT CTCTGCAGAG GTCAAGTGAA AGCGACGGCC 
GCAGCCAACA GAGTTCAAAA TGCAGGCTTG GAAAGTACAG GGGGCTCTGT GGAGGATGGG 
AAGGACTGAT CCACATTCCC ACCAGGAAGT TTAGCAGAAC CCCCGCGTGC CAAC7GGACC 
CCTTGGAAGG ACCTGGCTCA GGCTGGACCA CCTCTTGAGA GGGAGGAGCT CTGGATTTGA 
TCAAGAATTC TTTGCTGAGC ATCGTGCCTC ATGCCTATAA TACCAACACT TTGGGAGGCC 
AGTGTGGGAG GATCTCTTGA GCCCAGGAGT TCAAGACTAG CCTGGGCAAC ACAGAGAGAA 
CCCATCTCTA AAATAATAAT AATAATAAAA TAAAAAATTA GCAGGGCATG GTGGCATGTG 
CCTGTAGTTC CAGCTACCCA GGAGGCTGAG GCAAGAGGAT GGCTGGAGCC TGGGATGTTG 
AGGCTGCAAT GAACTGTGAT TACCCCACTG CACTCCAGCC TGGGCAAAAG AGCGAGAGAA 
CCTGTCTCAA ATAATAATAA TAATAATAAT CTTATTTTGG AGAATAAAGA GACCTCTGGA 
TTTGAGGTGC CATTTGGGTA GAAAGAAAAG ACGTTTACAC O 
CCTGAAGGAG CAGAGGGATG CATCGCTGGA GGTGACCTAC A 
GACAGACCTT GTCCTTCTTC CTTGTGGAAA G 

CTCTTCCCCC TCCCTGTCCC AGGGAAC CAA AGGGCTTTCT ACCACACCCT T 
CCGCCTCCCA TGTCTGCTGT GCCTTTGTAC TCAGCAATTC TTGTTTGCTC CATTATCTTC 1020 
CAGCCGGATA CAGAGTGAAT AGTTAACCAC ACTTAGGTCA AATAGGATCT AAATTTTTGT 1080 
TCCTGCTCCG TGTAAAGAGG CCAGTGTTTG TGTGTTGCAA GCAGCCTTGG AATAGTAACT 1140 



CTTCTCATTT GTTTGGGATC
AGTTCATCAG GCTCTCGGAC 
GTGAAGGCTC GTGTTCTCCA 
TTGGAAGGGC AAAAAATGAA 
ACAGTCTGCT GTGAAGACCT 
GCTTCATGAG AGACTGACAG 
CAGTGTGTGC TGATGACACA 
ACTCTGTAGC CAACATACAC 
CTTGTCCAAA TGCAGAGTCA 
TGGCTTTCTT TTTGCAAAAA 



TGGCCACCAA 
CTTAGGGCTG 
TCCTCAACTT 
CACTGTCGTT 
TCTCTCAAGT 
CTATCAGGGG 
TACACACCTG 



GTTCCAGAAT 
TTGGAGAAGG 
TCTTTGCTTC 
CATTGCAGCC 
G V A 1 rGGG 
TTGTGGCACT 
ACAATAGCTT 
ACCCTTTCTA 
TACTTCATTA 
TTTTGTATGT 



GATACACGGA 
, rCAGCAGI 
GATCATACAC 
GTGTTTTGTG 
AGTCCATGCC 
TAGTGAGGAC 
GAGTCTTCTC 
AATATCTATC 
TTATTTCCAA 
TGCAAAAAAA 



TCAGTGCAGA 
AGAACTGATG 
AAGAATACAT 
ACACAGATGC 
AGATCATGGT 

TGTTCCTTTT 
ATGGTTCATC 
GGCGAATAGT 
AAAAAAAAAA 






CTTCTCTCCC GCGGCGCTGG 



GCCCGCGGCG CCGACCCTTC 
CCGCGGCTCC GGCCCTGGCC 
* CCTTAAGGAT 



CTGCCGGCTG 
CCGATGGCTC 



CCGCTGCTGT 
CTCCTCGACC 

CCGCCCCGTC 



ACTATGAAGG 



AGACTCCAAG 
GGAACTGTTC 



GGCGCTTTTC 
CTCAACCTCA 
GCCGTCACCG 



ATTCTGGATG 300 



TCATCTCTGG 
TATTCACTAT 



CTTGGAAACA C 



3 CTGAAGATGC TCCCTGGTTT 



TGGATAATAC 



TAGTAGCAGG 
TTGGCCTTAC 
GCAGTGGAAC 
GAGTAGGCAC 
CAGCTTTGCT 
GCTTTATGAG 
AGGTCAAAGT 
CAGATGAAGA 



TACTGTGCAC 
ATGTTCATTA 
TGTAGACAAA 
CATAGCAAAT 
AGAAAATGGA 



TTCAGGGGCC 



AAATCGTGCT 



TAATTGAAAG 
ATGACACAGA 
ATTTTCATAA 
TTTCCAGGCA 



AGGATCAGGA 



GATAAATGCT 
TAGAACCCCA 
AAGAAAAATC 
CTTGAAACCA 
TGCATCCCTT 
GCTCATCAGA 
AGATGAAGTG 
TCCATTTCCA 
TAAGGACTTT 
TAAAAAGCCA 
GAAGCTGATG 
TGTAGGTGTG 



GAATCTAAAA 
AAAGTAATAC 
CAGAACAGAG 
TTGGTTTTAC 
TTGAACCAAG 
AGCCAGTCTG 
GCTCATTCTT 
CGGAAAAGAA 
ATTGGGGATA 
GAGGAGGTTT 



TCAGAGAAGA 
TTCAGAAGGA 
GATTCAAGGT 



TGAAGAATTT 
AACTTGAGTG 
TCAGTGGTGG 
AACAAGTCAT 
TTGAAGAGGC 
TCTACAATGG 
TTAGCAAAAC 
CTTTTGTTAA 
TCCTTGCCCA 
GCAAAGCTGA 



TGGAGTTTTC 
AGAAGTTGTT 
TGTGGCACGT 
AGCACAGTTC 
CCTTTATGAT 
TATTCGAGAG 
AGTAGACTCA 



AACAACAACC 
GTTAAATATG 
GATTGCCAAT 
AGGTACTTTA 
ACTCATCAAA 
AGTAATAGAA 



CGTGCTATTA 
GATCCAGCAA 
ATGAATAAGG 
AACATTAGTG 
TTGCTTACAC 
TCTGGAAACA 
CACCCTGAAG 
ATAGCTGGAT 
ATCAAAGAGA 
ACAGTTTGTA 
ATTGATAATG 
CTTGGAATTC 
CTACCAATAT 
ACCACAAGTC 
GAAGTAATTG 
CGGCCTGATC 
ACCCATCACA 
CCTCTGAAAG 



CCTGAAACCA 
CATACCCTAT 
CAAATTACCA 



TGGCAATCAG AGTAATATGT GCTGAAGAAC 
ACAATATTTT GAAAATAGTA GCTGATTTTT 
TACAGAGAGT CAAAGCCTGC ACAACAGAAG 
GTCTGCATTC ACTGAATGCC TTCTTGCTGC 
GTCGTTCCTA CAGTTACGTG TGTGGAATCT 
TTATTTTTCT GGCTAGGCTT ATACCTCGCA 



TGATTTTGAC 



CTGGCAATGA 



TATTCGAACC 



AGTAATAAAC T 



TCGAATTATG 



GGGTGCTCAG TACTTTACGC CAAGCTGATT 
AGGGAGTCTG GGTATGCTGG GAAAATCAGC CAGATGCCGG 
TTTGATCGGG ACCCACTTCA AAAGCAGCCT TCATGCCAGA 
TTTATTACTA GTGACTTCAT GACTGGTATA CCTGCAACAC 
GAGGTGGTAT TAAAGATGGT CACTGAGATT AAGAAGATTC 
CATCAAAGCC CCCAGGAACT ACTGAGTGGG 



2040 
2220 



MALCNGDSKL 
AFAIKEQGPR 
KSVREDGVFN 
SKKLYGAQFH 
VLLSGGVDST 
HSFYNGTTTL 
EVFLAQGTLH 
ILGRELGLPE 
TLLQRVKACT 
ESLIFLARLI 
ESGYAGKISQ 
WLKMVTEIK 



ENAGGDLKDG 
AIIISGGENS 
ISVDNTCSLF 
PEVGLTENGK 
VCTALLNRAL 
PISDEDRTPR 



I 



I 



ELVSRHPFPG 



NQEQVIAVHI 
KRISKTLNMT 
ASGKAELIKT 
PGLAIRVICA 
ITSLHSLNAF 
VYIFGPPVKE 



Y DLTSKPPGTT EWE 



LDAGAQYGKV IDRRVRELFV Q 
PAIFTIGKPV L( 
LTHGDSVDKV ADGFKWARS GNIVAGIANE 
AGCSGTFTVQ NRELECIREI KERVGTSKVL 
DNGFMRKRES QSVEEALKKL GIQVKVINAA 
TSPEEKRKII GDTFVKIANE VIGEMNLKPE 
HHKDTELIRK LREEGKVIEP LKDFHKDEVR 
EEPYICKDFP ETNKILKIVA DFSASVKKPH 
LLPIKTVGVQ GDCRSYSYVC 
PPTDVTPTFL TTGVLSTLRQ 
CQRSWIRTF ITSDFMTGIP 



ADFEAHNILR 



Seq ID NO: 119 DNA sequence
Nucleic Acid Accession #: N
Coding sequence: 27-1967



ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCT"GC 
TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGIGTGCC CGGAGAGGCT GAGCAGCCTG 
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 



TCATCTTCCG TGTGCGCCAG
TCAGCCTCCA GGACAGAGGG 
GCATCTTCTT GTGCCAGGGC 
TCTACAAAGC T CCGGAGGAG 
GTAAGGAGCC TGAGGAGGTC 
TCATCTGGTA CAAGAATGGC 
CGTCCCAGAC TGTGGAGTCG 
TGGTTAAAGA AGACAAAGAT 
GGAACCACAT GAAGGAGTCC 
TGTGGCTGGA AGTGGAGCCC 
GTTTGGCTGA TGGCAACCCT 
GGGAGGCAGA GGAAGAGACA 
AGGAACACAG TGGGCGCTAT 
TGAGTGAACC ACAGGAACTA 
CCCCTGAGAG ACAGGAAGGC 
ACCTCGAGTT CCAGTGGCTG 
TTCAGTTGCA TGACCTGAAA 
CCAGCATACC CGGCCTGAAC 
GGATGGCATT CAAGGAGAGG 
GTGAAGCGTC AGGGCACCCC 
AACAAGACCA AGATCCACAG 
TGTTGGAGAC AGGTGTTGAA 
TCTTCCTGGA GCTGGTCAAT 
TCAGCACTTC CACTGCCAGT 
TGCCGGAGCC GGAGAGCCGG 
TGGCGGTGCT GGGCGCTGTC 
GCTCAGGGAA GCAGGAGATC 
TTAAGTCAGA TAAGCTCCCA 
GGGCTCCGGG AGACCAGGGA 



GGCCAGGGCC 
GCTACTCTGG 
AAGCGCCCTC 
CCAAACATCC 
GCTACCTGTG 
CGGCCTCTGA 
AGTGGTTTGT 



TGGGGAGTAC GAGCAGCGGC 
AGTCACCCCC CAAGACGAGC 
GTACCGCATC CAGCTCCGCG 
CCTGGGCATC CCTGTGAACA 
CGGGTACCCC ATTCCTCAAG 
GAACCGGGTC CACA7TCAGT 
GAGTATTCTG AAGGCACAGC 
CAACTACCGG CTGCCCAGTG 






GTGGGAATGC 



GAATGTCAGG 



AGCAGCCTCA 
AGAGAAGAGA 
CGGGAGGCAG 
CGCACACAGC 
AAGGTGTGGG 
CGGCCCACCA 



TGAAGGAAGG 
TCAGCATCAG 
ACGGGGTCCT 
CCTGGAACTT 
ATGTGTCTGA 



GGACCGCGTG 
CAAGCAGAAC 
GGTGCTGGAG 
GGACACCATG 
CGTCCGAGTG 



A TCGCTGCGTG G 



TGAAAGAGAA 
TCTCCTGGAA 
GCACCCTGAA 




GGGCAAAAAC 



AATCTGTCTT 
ACGGCAAGTG 
ACCCCGGAGC 



AAATACCTGG 
CAAAGGCTGG 
GCCTGCTCAT 



CTCCTGCCAG 
CTTCCACCAT 
GTTGAAGTGC 
GCAGTGTTGC 
TTTGGTCAGA 
GGTGGCTCAC 
AGGACGAGAC 
AAT T AGCTAG 
CTGAAGCAGG AGAATGGTAT GAATCCAGGA 



ACATTTTTTC 



GAAGAGATGG 
GAGAAATACA 
ACCATTCCCA 
AAGCCTCCTG 
AGGACCTCAC 
CGTTGAGTGA 
CTTGCAGAAC 
CAGCTGAGCT 
CCAGGTGCAC 
GCTGTTCACA 
TGCCACCACC 
AGCCAGGAAC 
GCCTGTAATC 
CATCCTGGCT 



CTCCCCTCAC 
TTGGCCCTGC 
AGCTCATCCC 



GATTGTGTGC 
GGGCAAGCTG 
GACCGAACTT 
GGGCAGCAGC 
GCATTAGCCC 
CACTCTTCTC 
CTGCACACCC 
AAGCCGCTTT 
AAGCAAGGAG 
TACACACA 



CCGTGCAGGC 
GTAGTTGAAG 
GGTGACAAGA 
CGAATCACTT 
TCAGCCAAAG 



CTCCTGCTCG 



CACTGCACTC CAGCCTGGCC 
ACGCGTACCT GCGGTGAGGA 
TCCCCGTGTT CACTTGCTCC 
GGGGAGCAGA CAAAGATGAG 



AGCTGGGCGC 
CATAGCCCTC 
GTCTACACTG 
ACCAAGCTCA 



CTGTGTGTAT 



TGTCCCAGAA 



AGAGATCAGG GGTTACCTCT 
CTACCCTACT TTTCAGCAGC 
TGTTAGCAGG 



CCAGCACTTT 
AACACGGTGA 
TGGCACCTAT 
GGTGGAGCTT 
GACTCCGTCT 
TGTTTTCGAG 
TTGATGGATC 
TCCTTCATGG 
GGGCCCCAAC 
CTGGCTAGAG 
ATGGTTTTGT 
TATATGAAAA 
TGCTTTTTTA 
AGGCACACAA 
AAATGGCTCA 



AGGACACACC 
AGAGCACCCC 
CCTCTTCAAA 
CCTTAAAAGA 



CGAGGAAAAA 
TTCAGGTGAA 
ACGTAAAACT 



TACAACCAAA 
GCTTCTGAGC 
AAAACGTCCC 
CTTCCTATCG TTTCCGTCCA CTT 



TATATATATA 
TTCTACATGG 
AACCGTTTCC 
AGCTCTACCA 
GCACGAAGGG 



TATGAAAAAT 
GTACCACAGG 
AGTTGGCAGC 
GAGCAGACAG 
CCTGGCAGGC 



GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC . 



GGAGCCAGGC 2 400 



AGCGGCATCC 



TACGTGCCGG 



CTACTAAAAA 
ACTCGGAAGG 
GAGACCGTGC 
AAAAGAAAAG 
TTAGCCTCAA 
GAAAGGCAGC 
TATGGTTATA 



2460
2520
2580
2640
2700
2760



3240 
3300 
3360 

3420

3480 
3540 






TLIFRVRQGQ 
RVYKAPEEPN 
QSSQTVESSG 
KVWLEVEPVG 
RKEHSGRYEC 
QDLEFQWLRE 
PWMAFKERKV 
ELLETGVECT 
KLPEPESRGV 
EVKSDKLPEE 



DNGVLVLEPA 
LTLTCEAESS 
QLVKLAIFGP 
LSTLNVLVTP 
TRANSTSTER 
PPSRKTELW 

Seq ID NO: 121 DNA sequence 
Nucleic Acid Accession #: NM_0
Coding sequence: 60-571 



VAGVPGEAEQ 
GQSEPGEYEQ 
IQVNPLGIPV 
LYTLQSILKA 
MLKEGDRVEI 
QAWNLDTMIS 
ETDQVLERGP 
WVKENMVLNL 
ASNDLGKNTS 
VIVAVIVCIL 
MGLLQGSSGD 



I 



GSTALLKCGL 
RLSLQDRGAT LALTQVTPQD 
NSKEPEEVAT CVGRNGYPIP 
QLVKEDKDAQ 
RCLADGNPPP 
LLSEPQELLV 
VLQLHDLKRE 
SCEASGHPRP TI SWKVNGTA 



ERIFLCQGKR 
QVIWYKNGRP 
SGNHMKESRE 



KRAPGDQGEK YIDLRH 



AAPERQEGSS 
VPSIPGLNRT 
SEQDQDPQRV 
GLSTSTASPH 
RRSGKQEITL 



I 



I 



I 



I 



ATAGTCTACA CAGAGCTCCC CTTGCTGCCC AGACAAGCTG AAGGACCACA GGAAAAGCCA 
TGGAGACTTC AGCATCCTCC TCCCAGCCTC AGGACAACAG TCAAG-CCAC AGAGAAACAG 
AAGATGTAGA CTATGGAGAG ACAGATTTCC ACAAGCAAGA CGGGAAGGCT GGACTCTTTT 
A ATATGAGAGA AACAAGTCTT CTTCCTCCTC CTTCTCTTCC TCCTCATCCT 
ATCTTC TTCATCCTCC TCCTCCTCAG GTCCTGGGCA TGGGGAGCCT GAOJTTTTGA
GACTGAATAT
GGGCCTTGCT 
TGCTCACCTT 



AGACTGACTG 



ACTGCTCTCA 



AAAGAAAGAT 
GGTGTGTTAT 
CGCCTCCCTG 
CCAAGGCTTC 
AGGCCACTTC 
AGCCCACAAG 
ACAGAACACT 
CATTTGGGCA 
GGGCCTGAGG 



GATGAGTTTT TCCATTTCGT 
CACTATTACG CAGACTGGTT 
GAAACCGTTG GCATCTACTT 
ATCCCCCTCT TCCAGAAGTT 



GGCAGAGCAG CATTCTGAGA 
TTTCCTTCCA TGTGGTCTGA 
GTACTGCTGT GCAACCCAGC 
CTTCACAGTA CCTGGACCAG 



GGTACCCTCT GGGGAATCAG 
GGAGGCCTCT CAGTTAAGAA 
CCTCCTGTGC TTTGCCATCG 
CATGTCTCTT GGGGTCGGCC 
CGGACTAGTG TACCGTATCC 
TAGGCTGACA GGGT7CAGGA 
GCCCCAGTGT GACCACCACT 
GACGCACAGG AGACCAAGCC
TCAAATCCCA



GAACCTGATG 
GGCAGGAAAA TGATCATCAG 
TCCTCGCACT TTGGGAGGCT 
CTGGGCAACA TAGTGAGACC 
TACATACCTG TACATACCTG 
TGAGCCCAGG AGTTCAGGGC 
CGACAGAGCA AGATCGTTTC 



AAACTAAATG GCAGCCAGGC 
CAGGCTAAGG GTGGCTTGAA 
CCCATCTCTA CAATTTTTTT 
CGGTTCCAGC TACTCAAGAG 
TGCAGTGAGG TACGATCAAG 
TCTAAAATT 



TGCAAGGATG GAAGGCAGAG 
CAGGAAGATT CTGGGAGGTC 
GCTGTGATCT GTCTGGGTTC 
AGGTGCTACC GAGCCATCCT 
TCGGGCTGGG GCCCCTTGGT 
TTTTTGCACC CAAAAAAAAA 
ATGGGGGCTC ACGACTGTAA 
GCTGAGAGTT CAAGACCAAC 
TTAATGACCA AATGTGGCGG 
GCTGAGGCAG GAGGACTGCT 
CCACTGCACT CCAGCOIGGG 



1080
1140
1200
1260
1320 
1380 
1440 
1500 



SSSSSSSSSS GPGHGEPDVL KDELQLYGDA PGEWPSGES GLRRRGSDPA SGEVEASQLR 
RLNIKKDDEF PHFVLLCPAI GALLVCYHYY ADWFMSLGVG LLTFASLETV GIYFGLVYRI 
HSVLQGFIPL FQKFRLTGFR KTD 



Seq ID NO: 123 DNA sequence
Nucleic Acid Accession #: BC022542
Coding sequence: 243-896

1 11 21 31 41 51

I I I I I I



ACTTGGTCCC AGCCGATAAA TCTGGGGCAG CGCGCGGTAG GAGCTGCGGG CGGCCAGGCC 
CCTTCCTGCG TCCGCACCTG GCCCCGCGCG CCCCTCTCGG GCGTCCGGCT TCCGGCGTCC 
TGGCGGCTCG GGTGGCGGCG GTTCGGGCGG CCGCCTGGCT GCTCCTCGGG GCGGCGACGG 
GGCTCACGCG CGGGCCCGCC ACGGCCTTCA CCGCCGCGCG CTCTGACGCC GGCATAAGGG 
CCATGTCTTC TGAAATTATT TTGAGGCAAG AAGTTTTGAA AGATGGTTTC CACAGAGACC 
TTTTAATCAA AGTGAAGTTT GGGGAAAGCA TTGAGGACTT GCACACGTGC CGTCTCTTAA 
TTAAACAGGA CATTCCTGCA GGACTTTATG TGGATCCGTA TGAGTTGGCT TCATTACGAG 
ACACAAACAT AACAGAGGCA GTGATGGTTT CAGAAAATTT TGATATAGAG GCCCCTAACT 
ATTTGTCCAA GGAGTCTGAA GTTCTCATTT ATGCCAGACG AGATTCACAG TGCATTGACT 
' i j rCAAGC CTTTTTGCCT GTGCACTGCC GCTATCATCG GCCGCACAGT GAAGATGGAG 
AAGCCTCGAT TGTGGTCAAT AACCCAGATT TGTTGATGTT TTGTGACCAA GAGTTCCCGA 
TTTTGAAATG CTGGGCTCAC TCAGAAGTGG CAGCCCCTTG TGCTTTGGAT AATGAGGATA 
TATGCCAATG GAACAAGATG AAGTATAAAT CAGTATATAA GAATGTGATT CTACAAGTTC 
CAGTGGGACT GACTGTACAT ACCTCTCTAG TATGTTCTGT GACTCTGCTC ATTACAATCC 
TGTGCTCTAC ATTGATCCTT GTAGCAGTTT TCAAATATGG CCATTTTTCC CTATAAGTTT 
TATGTAGTTA AATGCTTCCT AGAAACCTAA ATAAGATCTA TTAATTTCTG ACGAGAGGTG 
TTCTTCTAGA ATTAATTACT TTTATCTTTT GTCTTCATTT GTGGCCAAAA TTATGTTTAC 
TAGAGGAAAT TTGGGATCAT TCTCAGCTAA TTCCAAAATG TAGTGCTCTA TTGCATGGAT 
CCTTGGTAAT CCTCAAGCAT CAGATGCCAT AAGGGGAAAC TTAATTCIGC TAAATTAATG 
TTTATTTTGT GAGAAGTCAC TTTATCTTCA TTTGGGGTAG AAAAATTATT TCTTTATGTA 
A ATTATTCTCA TTTTGCAAGT ACTTTCAATT TAAGCTACAA ATTGAGAAAA 
A TAAGAATAAA ATAGGCCAGG CACAGTGGCT CACACCTGTA ATCCCAGCAC 
C CGAGGTGGGC GGATCACCAG AGGTCAAGAG TTTGAGfl 
ACCCTGTCTC TACTAAAAAT ACAAAAGTTA GCTGGGGCTG GTGGTGGGCA T 
CAGCTAATTG GAAGGGTGAG GCGGGAGGAT CGCTTGAACC TGGGAGGCGG AGGTTCCAGA 
GAGCCAAGAT CGCACCACTG CACTACAGCC TGGGCGACAG AACGAGACCC 
GGAAAAACAA AAAAGAAGAA TAAAATAATT TGGATGAAAA TCATGTTTAT 
ATGTCATGAG ACTATTAAAG ATGTGCCAGA GTTTCAATGA AAATCATTAA AGTAGGACAG 
CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA TTATTCCTTA 
ACGCACTCCA TTCTCCTTTT ACATTTTATC ATGTTTCTTT TGAATATATG AATTGGCAAA 
GGACTTGATG AAACTGAGTA CTAAGATTTG GTACAGAGTA TGTCAGGAAG ACAACTCAGA 
GTACATGAAC AAAAAAAAAA AAAAAA 



MCSEIILRQE VLKDGFHRDL LIKVKFGESI EDLHTCRLLI KQDIPAGLYV DPYELASLRE 

RNITEAVMVS ENFDIEAPNY LSKESEVLIY ARRDSQCIDC FQAFLPVHCR YKRPHSEDGE 

ASIVVNNPDL LMFCDQAGSR RMIRFRFDSF DKTIEFPILK CWAHSEVAAP CALENEDICQ

WNKMKYKSVY KNVILQVPVG LTVHTSLVCS VTLLITILCS KKKKK 



11 



I 



AGACACCTCT GCCCTCACCA 
GGGCTGCTGC TTTGCTGCCC 
CCTGAGAACC AATCTCACCG 
CACTCGGGTG GCAGAGATGC 
CCAGAAGCAA CTGTCCCTGC 
GCGAACCCCA CGGTGCGGGG 
CAAGTGGCAC CACCACAACA 
GGCGGTGATT GACGACGCCT 
CACCTTCACT CGCGTGTACA 
GCACGGAGAC GGGTATCCCT 
TGGCCCCGGC ATTCAGGGAG 
GGGCGTCGTG GTTCCAACTC 
CATCTTCGAG GGCCGCTCCT 
CTGGTGCAGT ACCACGGCCA 
GAGACTCTAC ACCCGGGACG 



TGAGCCTCTG G 



GTGGAGAGTC GAAATCTCTG 



GTCCTGGTGC TCCTGGTGCT 
CTTGTGCTCT TCCCTGGAGA 
TACCTGTACC GCTATGGTTA 
GGGCCTGCGC TGCTGCTTCT 
AGCGCCACGC TGAAGGCCAT 



TCCCAGACCT 
TCACCTATTG 
TTGCCCGCGC 

GCCGGGACGC AGACATCGTC ATCCAGTTTG GTGTCGCGGA 



Z TACTCGGAAG ACTTGCCGCG 



ACGCCCATTT C 



ACTCTGCCTG 
ACTACGACAC 
GCAATGCTGA 



CGACGACCGG 




AGGTCCCCCC 
TGCCTGCAAC 
CAAGGATGGG 
CCTTATCGCC 
GCTCTCCAAG 



CTCATGTACC 
GGCATCCGGC 
ACACCGCAGC 
TCAGAGCGCC 
ACTGCTGGCC 
GTGAACATCT 
AAGTACTGGC 
GACAAGTGGC 
AAGCTTTTCT 
CCGAGGCGTC 



ACCGGGACAA GCTCTTCGGC 
ACTCGGCGGG GGAGCTGTGC 
GTACCAGCGA GGGCCGCGGA 
GCGACAAGAA GTGGGGCTTC 
ATGAGTTCGG CCACGCGCTG 
CTATGTACCG CTTCACTGAG 
ACCTCTATGG TCCTCGCCCT 
CCACGGCTCC CCCGACGGTC 



GAGTTGTGGT CCCTGGGCAA 
GCGGCCTGCC ACTTCCCCTT 
ACGGCTTGCC 
GCCCCAGCGA 
TGCCAGTTTC CATTCATCTT 
TCCGACGGCT ACCGCTGGTG 
TTCTGCCCGA CCCGAGCTGA 
GTCTTCCCCT TCACTTTCCT



CCAGGACCGC TTCTACTGGC G 



CTTCTACGGC CACTACTGTG 
TCGACGCCAT CGCGGAGATT 
GATTCTCTGA 
CCGCGCTGCC 
TCTTCTCTGG GCGCCAGGTG 
TGGACAAGCT GGGCCTGGGA 
GGGGGAAGAT GCTGCTGTTC 
TGGTGGATCC CCGGAGCGCC 
CGCACGAGGT CTTCCAGTAC 



GAACCTGAGC CACGGCCTCC 
TGCCCCACCG GACCCCCCAC 
CCCCCCTCAG CTGGCCCCAC 



GGGAACCAGC 
AGCCGGCCGC AGGGCCCCTT 



GCAGTGCCAT 
CAAACTGGTA 
TCACCTTTGT 



ACCTATGACA 
GTAAATCCCC 
TTCTGTTCTG 



AGCGAGGTGG 
CGAGAGAAAG 
TTGAACCAGG 
TAGGGCTCCC 



CAGGCGCGTC 
CCCAGGTGAC 
GCCTCTGGAG 
ACCGGATGTT 
CCTATTTCTG 



TTGCCGGATA 

CCCTCTCTTC 

TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 



MSLWQPLVLV 
RGESKSLGPA 
ITYWIQNYSE 
FDGKDGLLAH 
YSACTTDGRS 
ACTTDGRSDG 
CTSEGRGDGR 
PMYRFTEGPP 
PTAGPTGPPS 



LLVLGCCFAA 
LLLLQKQLSL 
DLPRAVIDDA 
AFPPGPGIQG DAHFDDDELW 



DRQLAEEYLY 
VPDLGRFQTF 
SRDADIVIQF 
RFGNADGAAC 



RYGYTRVAEN 



3 SDKKWGFCPD 



THDVFQYREK 



GCAGAAATAG 



AGPTGPPTAG 
QGPFLIADKW
AQVTGALRSG 
AYFCQDRFYW



PSTATTVPLS 
PALPRKLDSV 
RGKMLLFSGR 
RVSSRSELNQ 



NSAGELCVFP 
QGYSLFLVAA HEFGHALGLD 
PRPPTTTTPQ PTAPPTVCPT 
PVDDACNVNI FDAIAEIGNQ 
FEEPLSKKLF FFSGRQVWVY 
RLWRFDVKAQ MVDPRSASEV 
VDQVGYVTYD ILQCPED 



CCTAGGGAGA 
CAGTGGCGCT 
CCTGCCTGCG 
AAGCAGATTG 



TCAACCCCGA 
TCGTGGACGT 
CGCTGCTGCT 
AAGAGCTGAA 



GATGCTGAAC 



GGTGCCAGCG 
CTTCAGGAAA 
CATGAAGCAG 
TAATCAAGAC 
AGAGAAAATG TCCCCTGAAG ACAGAGCAAA 



2 AAACTGGGAT TTGAGGATGG 



CAGAGAATTC 
GGCAGCCTAA 
AATATATACC 
TGTTCTGCAG 
ACAGCTGTCC 



ACCGAGCGTG 



CCCCATGCAG 



AAGACCTTGG 



ACTGGGCCAT 
ATCCGATATC 
ATGTGGTTAT 



TGGATGGCCA 
GTTCAGAGGA 
AGCAAGGAGA 
GAGGGACTTT 
TCTAAAATGC 
CCCTCAGCCA 



GCTGTTTCCC 
GGGACAAGAA 
CACAATCGGA 
ATCAGTTCTG 
ATGCTTTGAA 
ATGTCGGGTA 
CCTCTATGAA 
CACCCTGCTG 
AGTCCGCTTC 
GCTGATTTCC 
TTCAGTACTT 
CACCCAGGCA 
GCTTCAGATG 
AATGGCTACT 



AAAGTGCTGT 
GAAGAGGAGT 
CTCACGGCCC 
GTTAGTCCTA 
CTTATTCACG 
AAACAGTTTC
AAGAATGAGG 
GATGACAAGG 
CTTGATGGAC 
AAGGACGCTG 



CTCTGGGCTC 



CCTCTTCCCT 



CTTAAGCACA 



CCATACAGGC 
TGAATTTCCA 
GAATGCCTTT 
CCAAGGTGTG 
CT CTCTGCAA 
TCAACATGAA 
GCTGTTCTTC 
AGCAGAGTGC 
CTCCCCAGTG 
TCTGTAAGTT 
TAGC 
| | | | | |

MLNKVLSRLG VAGQWRFVDV LGLEEESLGS VPAPACALLL LFPLTAQHSN FRKKQIEELK 60
GQEVSPKVYF MKQTIGNSCG TIGLIHAVAN NQDKLGFEDG SVLKQFLSET EKMSPEDRAK 120
CFEKNEAIQA AHDAVAQEGQ CRVDDKVNFH FILFNNVDGH LYELDGRMPF PVNHGASSED 180

TLLKDAAKVC REFTEREQGE VRFSAVALCK AA 



Seq ID NO: 129 DNA sequence 
Nucleic Acid Accession #: NM_000213
Coding sequence: 127-5385 

1 11 21 31 41 51 

CGCCCGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCGAGTCCG 60
CCCCGAGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGCCGG GAGAGGGAGG 120
AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 180
AGCGTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGGCCCCAGT GAAGAGCTGC 240
ACGGAGTGTG TCCGTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAGGGAC 300
CGGCGCTGCA ACACCCAGGC GGAGCTGCTG GCCGCGGGCT GCCAGCGGGA GAGCATCGTG 360
GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCGGCGC 420
AGCCAGATGT CCCCCCAAGG CCTGCGGGTC CGTCTGCGGC CCGGTGAGGA GCGGCATTTT 480
GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540
TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGGGGCAGAA CCTGGCTCGG 600
GTCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660
AGCGTCCCGC AGACGGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720
CCCCCCTTCT CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780
AAACTGCAGG GAGAGCGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGCGG CTTCGATGCC 840
ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCTG 900
CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGCGCCAA CGTGCTGGCT 960
GGCATCATGA GCCGCAACGA TGAACGGTGC CACCTGGACA CCAGGGGCAC CTACACCCAG 1020
TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTCGC CAAGCACAAC 1080
ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140
TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200
CTGGAGGAGG CCTTCAATCG GATCCGCTCC AACCTGGACA TCCGGGCCCT AGACAGCCCC 1260
CGAGGCCTTC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320
CACATCCGGC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380
GATGGGACGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440
TCCTTCTCCG ACGGCCTCAA GATGGACGCG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500
CTGCAAAAAG AGGTGCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560
TGTGTGTGCA GCGAGGGCTG GAGTGGCCAG ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620
GACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAAGCCGT GCTCCGGCCG TGGGGAGTGC 1680
CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCCGCTACG AGGGTCAGTT CTGCGAGTAT 1740
GACAACTTCC AGTGTCCCCG CACTTCCGGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800
ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860
AATGCCACCT GCATCGACAG CAATGGGGGC ATCTGTAATG GACGTGGCCA CTGTGAGTGT 1920
GGCCGCTGCC ACTGCCACCA GCAGTCGCTC TACACGGACA CCATCTGCGA GATCAACTAC 1980
TCGGCGATCC ACCCGGGCCT CTGCGAGGAC CTACGCTCCT GCGTGCAGTG CCAGGCGTGG 2040
GGCACCGGCG AGAAGAAGGG GCGCACCTGT GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100
GACGAGCTTA AGAGAGCCGA GGAGGTGGTG GTGCGCTGCT CCTTCCGGGA CGAGGATGAC 2160
GACTGCACCT ACAGCTACAC CATGGAAGGT GACGGCGCCC CTGGGCCCAA CAGCACTGTC 2220
CTGGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GCTGGCTCAT CCCCCTGCTC 2280
CTCCTCCTCC TGCCGCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340
TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400
GAAGACCACT ACATGCTGCG GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACGCCCATG 2460
CTGCGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATG 2520
CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580
GGGCTGTCCT TGCGCCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACACTCGG 2640
GAGTGCGCCC AGCTGCGCCA GGAGGTGGAG GAGAACCTGA ACGAGGTCTA CAGGCAGATC 2700
TCCGGTGTAC ACAAGCTCCA GCAGACCAAG TTCCGGCAGC AGCCCAATGC CGGGAAAAAG 2760
CAAGACCACA CCATTGTGGA CACAGTGCTG ATGGCGCCCC GCTCGGCCAA GCCGGCCCTG 2820
CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACCTCAA GGTGGCCCCC 2880
GGCTACTACA CCCTCACTGC AGACCAGGAC GCCCGGGGCA TGGTGGAGTT CCAGGAGGGC 2940
GTGGAGCTGG TGGACGTACG GGTGCCCCTC TTTATCCGGC CTGAGGATGA CGACGAGAAG 3000
CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCACCCTCGG CCGCCGCCTG 3060
GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACGTGG TGTCCTTTGA GCAGCCTGAG 3120
TTCTCGGTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGGAC 3180
GGCGGGAAGT CCCAGGTCTC CTACCGCACA CAGGATGGCA CCGCGCAGGG CAACCGGGAC 3240
TACATCCCCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300
GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCCGCCA GGTCCGCCGT 3360
TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TGGGCCAGCC CCACTCCACC 3420
ACCATCATCA TCAGGGACCC AGATGAACTG GACCGGAGCT TCACGAGTCA GATGTTGTCA 3480
TCACAGCCAC CCCCTCACGG CGACCTGGGC GCCCCGCAGA ACCCCAATGC TAAGGCCGCT 3540
GGGTCCAGGA AGATCCATTT CAACTGGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600
GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTCGA CAGCAAGGTG 3660
CCCTCAGTGG AGCTCACCAA CCTGTACCCG TATTGCGACT ATGAGATGAA GGTGTGCGCC 3720
TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780
GTGCCCAGCG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840
AGCTGGGCTG AGCCGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900
CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960
AAGAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAGTCCC AGCCCTACCG CTACACGGTG 4020
AAGGCGCGCA ACGGGGCCGG CTGGGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080
CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140
AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGCGATG ACGTTCTACG CTCTCCATCG 4200
GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260
TTTGCCTTCC CGGGCAGCAC CAACTCCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320
TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380
ACACGGGACT ACAACTCACT
GACTACTCCA CCCTCACCTC 
GACACGCCCA CCCGCCTGGT 
CAGGAGCCGC GGTGCGAGCG 
GGCGGTGAGC TGCATCGGCT 
GACCTCCTGC CCAACCACTC 



TGTCCCCTGC CAGGCTCCGC C 
TTCACTGCCC TGAGCCCAGA C 
GGGGATATCG TCGGCTACCT G 
GCATTCCGGG TGGATGGAGA C 
AACGTGCCCT ACAAGTTCAA G 
GAGGGCATCA TCACCATAGA G 
GCCGGGCTCT TCCAGCACCC G 



GTTCTCTGCC 
GCCGCTGCAG 
CAACATCCCC 
CTACGTGTTC 
CATCACCATT 



A GAACACTCAC 

CTGGGGCCCA 
GGCTACAGTG 
AACCCTGCCC 



ACTCGACCAC ACTGCCGAGG 
GCCTGACTGC TGGTGTGCCC 
CATCTCTCAG AGTGAGCTGG 
TGGAGTACCA GCTGCTGAAC 
AGACCTCGGT GGTGGTGGAA 
CCCAGAGCCA GGAAGGCTGG 
A GTGCCCCAGG CCCGCTGGTG
3 AGCGGCCACG GAGGCCCAAT 
C AAGGAGGAGG GCCAGCCACC 
A CCGTGCCGGG CCTCAGCGAG 
3 AGGGCTTCGG GCCAGAGCGC 



GGCGGCTCCC TCACCCGGCA TGTGACCCAG GAGTTTGTGA 
GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA 
CCCCCGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC 
CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA 
TCCTGGGAGG CATGAAGGGG GCAAGGTCCG TCCTCTGTGG 
AAAGAGCTGG GAGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 
ACTG 



3CATCACCAC CACCCACACC 
GGGCCCAGCA CCTGGAGGCA 
GCCGGACACT GACCACCAGC 
CTTGACCGCA CCCTGCCCCA 
CCGGAGCCTC CTCAGCTACT 
CAGAGCAGGG GCTAGGTGTC 
GCCCAAACCT ATTTGTAACC 



5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460
5520 
5580 
5640 



QTAVCTRDIG 
TQDYPSVPTL 
EAFNRIRSNL 
THVCQLPEDQ 
CSEGWSGQTC 
FQCPRTSGFL 
CHCHQQSLYT 
LKRAEEVVVR
LLPLLALLLL 
SGNLKGRDVV
AQLRQEVEEN 
LTEKQVEQRA 
LVEAIDVPAG 
KSQVSYRTQD 
VQLSNPKFGA 



RLLLAALISV 
GCQRESIVVM
DLYILMDFSN
KEPWPNSDPP



VRLLAKHNII 
DIRALDSPRG 
KGNIHLKPSF 
NCSTGSLSDI 
CNDRGRCSMG 
DTICEINYSA 



21 
I 

SLSGTLANRC 
ESSFQITEET 
SMSDDLDNLK 
FSFKNVISLT 
FSTESAFHYE 
PIFAVTNYSY 
LRTEVTSKMF 



LCWKYCACCK 
RWKVTNNMQR 
LNEVYRQISG 
FHDLKVAPGY 



QPCLRE 
QCVCEPGWTG 
IHPGLCEDLR 
TYSYTMEGDG 
ACLALLPCCN 
PGFATHAASI 
VHKLQQTKFR 
YTLTADQDAR
ITIIKEQARD
PVEGELLFQP 
IIRDPDELDR 



QIDTTLRRSQ 
KMGQNLARVL 
EDVDEFRNKL 
ADGANVLAGI 
SYYEKLHTYF 
QKTRTGSFHI 
ICDVCTCELQ 



MSPQGLRVRL 
SQLTSDYTIG 
QGERISGNLD 



CTDEMFRDRR 



K PCSGRGECQC 
PSCDCPLSNA 
SCVQCQAWGT 
APGPNSTVLV 



PVSSLGVLQE 
RRGEVGIYQV 
KEVRSARCSF 



AQGEGPYSSL 
KRPMSIPIIP 



GMVEFQEGVE 
VVSFEQPEFS
GEAWKELQVK 
SFTSQMLSSQ 

SGKPMGYRVK YWIQGDSESE AHLLDSKVPS 
VSSTVTQLSW 
SQPYRYTVKA 
DDVLRSPSGS 



STLTSVSSHD 

LPGSAFTLST 
RVDGDSPESR 
LFQHPbQSEY 



VSCRTHQEVP 
KKVLVDNPKN 
DIPIVDAQSG 
MTTTSAAAYG 
SRLTAGVPDT 
AQTSVVVEDL
PSAPGPLVFT 
LTVPGLSENV 
SSITTTHTSA 
QT 



NPTELVPYGL SLRLARLCTE 
HTIVDTVLMA 
LVDVRVPLFI 
VSRGDQVARI 
LLELQEVDSL 
PPPHGDLGAP 
VELTNLYPYC 
AEPAETNGEI 



TCIDSNGGIC NGRGHCECGR 
CNFKVKMVDE 
FWWLIPLLLL 
SDHLDTPMLR
NLLKPDTREC 
PRSAKPALLK 



PVIRRVLDGG 



SEPGRLAFNV
RMLLIENLRE 
EDYDSFLMYS 
THLSPHVPHR 
PTRLVFSALG 
LPNHSYVFRV 
ALSPDSLQLS 
PYKFKVQART 
TEPFIjVDGPT LGAQHLEAGG 



IVGYLVTCEM 
IITIESQDGG 
SLTRHVTQEF 



QNPNAKAAGS 
DYEMKVCAYG
TAYEVCYGLV 
EAIINLATQP 
HLVNGRMDFA 
SHSTTLPRDY 
SVEYQLLNGG 
QVHPQSPLCP 
AQGGGPATAF 
PFPQLGSRAG 
VSRTLTTSGT 



1 DNA sequence 



CCTCGTGCCG CGGACCCCAG CCTCTGCCAG 



CTCCGGACAC 
TGAGCCTGGC 
AGAAAAATGG 
ATAGCACCTT 
GCAGGTATGG 
GTGCAGCAAA 
ATTGCTTCAA 
ATGCCTTTGA 

ACGTGAGCAG 
CCTTTTCTAC 
ACAGAATCCC 
AAGAAAATGA 
ATGAAGATTT 



CAACCACAAG 
ACCCAGAAGC 
ATTCTACAAG 



CATGGACAAG 
GCAGATCGAT 
TCGCTACAGC 
GCCCACAATG 
GTTCATAGAA 
CAACACAGGG 
TGCTTCAGCT 
TGGACCAATT 
ATACAGAACG 
CGGCTCCTCC 
TGTACACCCC 
TGCTACCAGT 
AGATGAAAGA 
TATCTCCAGC 
CTGGACCCAG 




ATCTCTCGGA 



GTGTACATCC TCACATCCAA CACCTCCCAG 



ACACCCTCCC 
CACAATCCAG 
GTTTGGCAAC 



ACCATAACTA 
AATCCTGAAG 
AGTGAAAGGA 
ATCCCAGACG 
ACGTCTTCAA 
GACAGACACC 
ACCATTTCAA 
TGGAACCCAA 
GTAGACAGAA 
CTCATTCACC 
GCAACTCCTA 
AGATGGCATG 



AAGGCTTTCA 
TTTGAGACCT 
AACTCCATCT 
TATGACACAT 
GACCTGCCCA 
CGCTATGTCC 



GCAGCACTTC 
AAGACAGTCC 
ATACCATCTC 
TCAGTTTTTC 
CCACACCACG 



A AG A 

CTGGATCACC 
AGCAGGCTGG 
TGGATCAGGC 
GGCTTTTGAC 



ATGGCACCAC TGCTTATGAA 
ATGAGCATCA TGAGGAAGAA 
GTAGTACAAC GGAAGAAACA 
AGGGATATCG CCAAACACCC 



GACAGCACAG 
GAGCCAAATG 
A"TGATGATG 
CACACAAAAC 
CTACTTCAGA 
GGAAACTGGA 
GAGACCCCAC 
GCTACCCAGA 
AGAGAAGACT 



1080 
1260 
CCCATTCGAC AACAGGGACA GCTGCAGCCT CAGCTCATAC CAGCCATCCA ATGCAAGGAA 1320
GGACAACACC AAGCCCAGAG GACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380
CCATGGGACG AGGTCATCAA GCAGGAAGAA GGATGGATAT GGACTCCAGT CATAGTACAA 1440
CGCTTCAGCC TACTGCAAAT CCAAACACAG GTTTGGTGGA AGATTTGGAC AGGACAGGAC 1500
CTCTTTCAAT GACAACGCAG CAGAGTAATT CTCAGAGCTT CTCTACATCA CATGAAGGCT 1560
TGGAAGAAGA TAAAGACCAT CCAACAACTT CTACTCTGAC ATCAAGCAAT AGGAATGATG 1620
TCACAGGTGG AAGAAGAGAC CCAAATCATT CTGAAGGCTC AACTACTTTA CTGGAAGGTT 1680
ATACCTCTCA TTACCCACAC ACGAAGGAAA GCAGGACCTT CATCCCAGTG ACCTCAGCTA 1740
AGACTGGGTC CTTTGGAGTT ACTGCAGTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800
GTTCCTTATC AGGAGACCAA GACACATTCC ACCCCAGTGG GGGGTCCCAT ACCACTCATG 1860
GATCTGAATC AGATGGACAC TCACATGGGA GTCAAGAAGG TGGAGCAAAC ACAACCTCTG 1920
GTCCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCC CTCTTGGCCT 1980
TGGCTTTGAT TCTTGCAGTT TGCATTGCAG TCAACAGTCG AAGAAGGTGT GGGCAGAAGA 2040
AAAAGCTAGT GATCAACAGT GGCAATGGAG CTGTGGAGGA CAGAAAGCCA AGTGGACTCA 2100
ACGGAGAGGC CAGCAAGTCT CAGGAAATGG TGCATTTGGT GAACAAGGAG TCGTCAGAAA 2160
CTCCAGACCA GTTTATGACA GCTGATGAGA CAAGGAACCT GCAGAATGTG GACATGAAGA 2220
TTGGGGTGTA ACACCTACAC CATTATCTTG GAAAGAAACA ACCGTTGGAA ACATAACCAT 2280

JACACTT AACAGATGCA ATGTGCTACT GAT7GTTTCA TTGCGAATCT 2340 
T AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA 



| | | | | |

MDKFWWHAAW GLCLVPLSLA QIDLNITCRF AGVFHVEKNG RYSISRTEAA DLC 
PTMAQMEKAL SIGFETCRYG FIEGHVVIPR IHPNSICAAN NTGVYILTSN TSQYDTYCFN
ASAPPEEDCT SVTDLPNAFD GPITITIVNR DGTRYVQKGE YRTNPEDIYP SNPTDDDVSS
GSSSERSSTS GGYIFYTFST VHPIPDEDSP WITDSTDRIP ATSTSSNTIS AGWEPNEENE
DERDRHLSFS GSGIDDDEDF ISSTISTTPR AFDHTKQNQD HTQWNPSHSN PEVLLQTTTR
MTDVDRNGTT AYEGNWNPEA HPPLIHHEHH EEEETPHSTS TIQATPSSTT EETATQKEQW
FGNRWHEGYR QTPREDSHST TGTAAASAHT SHPMQGRTTP SPEDSSWTDF FNPISHPMGR
GHQAGRRMDM DSSHSTTLQP TANPNTGLVE DLDRTGPLSM TTQQSNSQSF STSHEGLEED
KDHPTTSTLT SSNRNDVTGG RRDPNHSEGS TTLLEGYTSH YPHTKESRTF IPVTSAKTGS
FGVTAVTVGD SNSNVNRSLS GDQDTFHPSG GSHTTHGSES DGHSHGSQEG GANTTSGPIR
TPQIPEWLII LASLLALALI LAVCIAVNSR RRCGQKKKLV INSGNGAVED RKPSGLNGEA
SKSQEMVHLV NKESSETPDQ FMTADETRNL QNVDMKIGV 



Seq ID NO: 133 DNA sequence
Nucleic Acid Accession #: NM_002882
Coding sequence: 150-755




1 11 21 31 41 51 

| | | | | |

CGAGGTTCGG GTCGTGGGGC GGAGGGAAGA GCGGGCGGGC GGGAGGCGCC GGCGCCAGAC 
GCGGAGGGAA GGAGCTACGA GTAGCCGCCG AGAGGCCGCG GAGCCAGCGA CGACCGACCC 

AGCCGAGCCG CCGCCGCCGC CGCGCCCCCA TGGCGGCCGC CAAGGACACT CATGAGGACC

ATGATACTTC CACTGAGAAT ACAGACGAGT CCAACCATGA CCCTCAGTTT GAGCCAATAG 
TTTCTCTTCC TGAGCAAGAA ATTAAAACAC TGGAAGAAGA TGAAGAGGAA CTTTTTAAAA 
TGCGGGCAAA ACTGTTCCGA TTTGCCTCTG AGAACGATCT CCCAGAATGG AAGGAGCGAG 
GCACTGGTGA CGTCAAGCTC CTGAAGCACA AGGAGAAAGG GGCCATCCGC CTCCTCATGC 

GGAGGGACAA GACCCTGAAG ATCTGTGCCA ACCACTACAT CACGCCGATG ATGGAGCTGA
AGCCCAACGC AGGTAGCGAC CGTGCCTGGG TCTGGAACAC CCACGCTGAC TTCGCCGACG 
AGTGCCCCAA GCCAGAGCTG CTGGCCATCC GCTTCCTGAA TGCTGAGAAT GCACAGAAAT
TCAAAACAAA GTTTGAAGAA TGCAGGAAAG AGATCGAAGA GAGAGAAAAG AAAGCAGGAT 
CAGGCAAAAA TGATCATGCC GAAAAAGTGG CGGAAAAGCT AGAAGCTCTC TCGGTGAAGG 

AGGAGACCAA GGAGGATGCT GAGGAGAAGC AATAAATCGT CTTATTTTAT TTTCTTTTCC
TCTCTTTCCT TTCCTTTTTT TAAAAAATTT TACCCTGCCC CTCTTTTTCG GTTTGTTTTT
ATTCTTTCAT TTTTACAAGG GACGTTATAT AAAGAACTGA ACTC 




MAAAKDTHED HDTSTENTDE SNHDPQFEPI VSLPEQEIKT LEEDEEELFK MRAKLFRFAS
ENDLPEWKER GTGDVKLLKH KEKGAIRLLM RRDKTLKICA NHYITPMMEL KPNAGSDRAW
VWNTHADFAD ECPKPELLAI RFLNAENAQK FKTKFEECRK EIEEREKKAG SGKNDHAEKV
AEKLEALSVK EETKEDAEEK Q 



GGATTTGAGG GACAGGGTCG GAGGGGGCTC TTCCGCCAGC ACCGGAGGAA GAAAGAGGAG 
GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACCGCGTGCG CTCGGCGGCT GCGGAGAGGG 
GGAGAGCAGG CAGCGGGCGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA
GCCTTCGGCT GACTGGCTGG CCACGGCCGC GGCCCGGGGT CGGGTAGAGG AGGTGCGGGC 
GCTGCTGGAG GCGGGGGCGC TGCCCAACGC ACCGAATAGT TACGGTCGGA GGCCGATCCA 
G ATGGGCAGCG CCCGAGTGGC GGAGCTGCTG CTGCTCCACG GCGCGGAGCC 
CAACTGCGCC GACCCCGCCA CTCTCACCCG ACCCGTGCAC GACGCTGCCC GGGAGGGCTT 540
CCTGGACACG CTGGTGGTGC TGCACCGGGC CGGGGCGCGG CTGGACGTGC GCGATGCCTG 600
GGGCCGTCTG CCCGTGGACC TGGCTGAGGA GCTGGGCCAT CGCGATGTCG CACGGTACCT 660
GCGCGCGGCT GCGGGGGGCA CCAGAGGCAG TAACCATGCC CGCATAGATG CCGCGGAAGG 720
TCCCTCAGAC ATCCCCGATT GAAAGAACCA GAGAGGCTCT GAGAAACCTC GGGAAACTTA 780
GATCATCAGT CACCGAAGGT CCTACAGGGC CACAACTGCC CCCGCCACAA CCCACCCCGC 840
TTTCGTAGTT TTCATTTAGA AAATAGAGCT TTTAAAAATG TCCTGCCTTT TAACGTAGAT 900
ATATGCCTTC CCCCACTACC GTAAATGTCC ATTTATATCA TTTTTTATAT ATTCTTATAA 960
AAATCTAAAA AAGAAAAACA CCGCTTCTGC CTTTTCACTG TGTTGGAGTT TTCTGGAGTG 1020
AGCACTCACG CCCTAAGCGC ACATTCATGT GGGCATTTCT TGCGAGCCTC GCAGCCTCCG 1080
GAAGCTGTCG ACTTCATGAC AAGCATTTTG TGAACTAGGG AAGCTCAGGG GGGTTACTGG 1140
CTTCTCTTGA GTCACACTGC TAGCAAATGG CAGAACCAAA GCTCAAATAA AAATAAAATA 1200
ATTTTCATTC ATTCACTC 



MEPAAGSSME PSADWLATAA ARGRVEEVRA LLEAGALPNA PNSYGRRPIQ VMMMGSARVA 
ELLLLHGAEP NCADPATLTR PVHDAAREGF LDTLVVLHRA GARLDVRDAW GRLPVDLAEE
LGHRDVARYL RAAAGGTRGS NHARIDAAEG PSDIPD 



TGTGTGGGGG TCTGCTTGGC GGTGAGGGGG CTCTACACAA GCTTCCTTTC CGTCATGCCG 
GCCCCCACCC TGGCTCTGAC CATTCTGTTC TCTCTGGCAG GTCATGATGA TGGGCAGCGC 
CCGAGTGGCG GAGCTGCTGC TGCTCCACGG CGCGGAGCCC AACTGCGCCG ACCCCGCCAC 
TCTCACCCGA CCCGTGCACG ACGCTGCCCG GGAGGGCTTC CTGGACACGC TGGTGGTGCT 
GCACCGGGCC GGGGCGCGGC TGGACGTGCG CGATGCCTGG GGCCGTCTGC CCGTGGACCT 
GGCTGAGGAG CTGGGCCATC GCGATGTCGC ACGGTACCTG CGCGCGGCTG CGGGGGGCAC 
CAGAGGCAGT AACCATGCCC GCATAGATGC CGCGGAAGGT CCCTCAGACA TCCCCGATTG 
AAAGAACCAG AGAGGCTCTG AGAAACCTCG GGAAACTTAG ATCATCAGTC ACCGAAGGTC 
CTACAGGGCC ACAACTGCCC CCGCCACAAC CCACCCCGCT TTCGTAGTTT TCATTTAGAA 
AATAGAGCTT TTAAAAATGT CCTGCCTTTT AACGTAGATA TAAGCCTTCC CCCACTACCG 
TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATGTAAAAA AGAAAAACAC 
CGCTTCTGCC TTTTCACTGT GTTGGAGTTT TCTGGAGTGA GCACTCACGC CCTAAGCGCA 
CATTCATGTC GGCATTTCTT GCGAGCCTCG CAGCCTCCGG AAGCTGTCGA CTTCATGACA 
GG GGTTACTGGC TTCTCTTGAG TCACACTGCT 



| | | | | |

MMMGSARVAE LLLLHGAEPN CADPATLTRP VHDAAREGFL DTLVVLHRAG ARLDVRDAWG
RLPVDLAEEL GHRDVARYLR AAAGGTRGSN HARIDAAEGP SDIPD

9 DNA sequence 



| | | | | |

CCCAACCTGG GGCGACTTCA GGTGTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 
TCCTCCGAGC ACTCGCTCAC GGCGTCCCCT TGCCTGGAAA GATACCGCGG TCCCTCCAGA 120 
GGATTTGAGG GACAGGGTCG GAGGGGGCTC TTCCGCCAGC ACCGGAGGAA GAAAGAGGAG 180 
GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACCGCGTGCG CTCGGCGGCT GCGGAGAGGG 240
GGAGAGCAGG CAGCGGGCGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA 300 
GCCGGCGGCG GGGAGCAGCA TGGAGCCTTC GGCTGACTGG CTGGCCACGG CCGCGGCCCG 360 
GGGTCGGGTA GAGGAGGTGC GGGCGCTGCT GGAGGCGGGG GCGCTGCCCA ACGCACCGAA 420 
TAGTTACGGT CGGAGGCCGA TCCAGGTGGG TAGAAGGTCT GCAGCGGGAG CAGGGGATGG 480
CGGGCGACTC TGGAGGACGA AGTTTGCAGG GGAATTGGAA TCAGGTAGCG CTTCGATTCT 540 
CCGGAAAAAG GGGAGGCTTC CTGGGGAGTT TTCAGAAGGG GTTTGTAATC ACAGACCTCC 600 
TCCTGGCGAC GCCCTGGGGG CTTGGGAAAC CAAGGAAGAG GAATGAGGAG CCACGCGCGT 660 
ACAGATCTCT CGAATGCTGA GAAGATCTGA AGGGGGGAAC ATATTTGTAT TAGATGGAAG 720
TCATGATGAT GGGCAGCGCC CGAGTGGCGG AGCTGCTGCT GCTCCACGGC GCGGAGCCCA 780 
ACTGCGCCGA CCCCGCCACT CTCACCCGAC CCGTGCACGA CGCTGCCCGG GAGGGCTTCC 840 
TGGACACGCT GGTGGTGCTG CACCGGGCCG GGGCGCGGCT GGACGTGCGC GATGCCTGGG 900 
GCCGTCTGCC CGTGGACCTG GCTGAGGAGC TGGGCCATCG CGATGTCGCA CGGTACCTGC 960 

GCGCGGCTGC GGGGGGCACC AGAGGCAGTA ACCATGCCCG CATAGATGCC GCGGAAGGTC 1020 

CCTCAGACAT CCCCGATTGA AAGAACCAGA GAGGCTCTGA GAAACCTCGG GAACTTAGAT 1080 

CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCCCCC GCCACAACCC ACCCCGCTTT 1140 

CGTAGTTTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCC TGCCTTTTAA CGTAGATATA 1200 

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1260

TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1320

ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTCGCA GCCTCCGGAA 1380 

GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGGG TTACTGGCTT 1440 

CTCTTGAGTC ACACTGCTAG CAAATGGCAG AACCAAAGCT CAAATAAAAA TAAAATAATT 1500 



RRSAAGAGDG



PAAGSSMEPS ADWLATAAAR GRVEEVRALL EAGALPNAPN SYGRRPIQVG 
ELESGSASIL RKKGRLPGEF SEGVCNHRPP PGDALGAWET 






C CCGCGTGCGC 
3 GTCCGGGTGG GAGTGGGGGT GGGGTGGGGG 
Z AGGGAAGGCG GGTGCGCGCC TGCGGGGCGG 



TGGGGGTGAA 



GCCGTTCCGA 



GGCGGCGCAG CGGCTGCCGA G 
TTGGTGACCC 

CACATCCCGC GGCTCACGGG GGAGTGGGCA GCGCCAGGGG 
GTGCTGATGC TACTGAGGAG CCAGCGTCTA GGGCAGCAGC 
CATGATGATG GGCAGCGCCC GAGTGGCGGA GCTGCTGCTG 
CCCGCCACTC 
GTGGTGCTGC 
GTGGACCTGG 



GCTCACCTCT 



GGGGCGGTGC 180 



GGACACGCTG 



CGCGGCTGCG 
CTCAGACATC 
CATCAGTCAC 



GCTGTCGACT 
CTCTTGAGTC 
TTCATTCATT 



CCCGATTGAA 
CGAAGGTCCT 
ATTTAGAAAA 
CACTACCGTA 
AAAAACACCG 
TAAGCGCACA 
TCATGACAAG 
ACACTGCTAG 
CACTC 

Seq ID NO: 142 Protein sequence



ACCGGGCCGG 
CTGAGGAGCT 
GAGGCAGTAA 



ACAGGGCCAC 
TAGAGCTTTT 
AATGTCCATT 
CTTCTGCCTT 
TTCATGTGGG 
CATTTTGTGA 
CAAATGGCAG 



GGCGCGGCTG 
GGGCCATCGC 
CCATGCCCGC 
AGGCTCTGAG 
AACTGCCCCC 
AAAAATGTCC 



GAGTGAGGGT 
CGCCCGCCGC 
CGCTTCCTAG 
CTCCACGGCG 
GCTGCCCGGG 
GACGTGCGCG 



GCGCAGGTTC 
TTTCGTGGTT 
TGTGGCCCTC 



TTCACTGTGT 
CATTTCTTGC 
ACTAGGGAAG 
AACCAAAGCT 



ATAGATGCCG 
AAACCTCGGG 
GCCACAACCC 
TGCCTTTTAA 

TTTATATATT 
TGGAGTTTTC 
GAGCCTCGCA 
CTCAGGGGGG 
CAAATAAAAA 



CGGAGCCCAA 
AGGGCTTCCT 
ATGCCTGGGG 
GGTACCTGCG 
CGGAAGGTCC 



TGGAGTGAGC 
GCCTCCGGAA 
TTACTGGCTT 
TAAAATAATT 



MCRCRCVCPS LQLRGQEWRC SPLVPKGGAA AAELGPGGGE NMVRRFLVTL RIRRACGPPR 
VRVFVVHIPR LTGEWAAPGA PAAVALVLML LRSQRLGQQP LPRRPGHDDG QRPSGGAAAA
PRRGAQLRRP RHSHPTRARR CPGGLPGHAG GAAPGRGAAG RARCLGPSAR GPG 

Seq ID NO: 143 DNA sequence 
Nucleic Acid A 
Coding sequen 



I 

GAAATTGCAC 
GATAAAGAGA 
AATGCTTATC 
GCCAGATATA 
AGGAGGGAGC 
TCTGCTGCAA 
CAGACTGTGG 



ACTTAAAGAC 



AACTCACAGA 
GTACTACCGC 
AGGTGTTGAA 
CCTCACGAAT 
CTCCAAACTG 
ATGCTCTGGA 
AAGGACTTTT 



ATCAGTGGAT 
TTTGGAGAAA 
GAAGGACAAA 
ATTGCTTGAA 



GTGGGAAAGG AAAGCTGACT 
ATTCGAGTCC TTGAGGCTGA GAAGGAGAAG 
GAAATACAGC GACTGAGAGA CCAACTGAAG 
CAGCTGGAAG AGACAACGAG AGAAGGAGAA 
GAAGAGAAAG ACGTATTGAA ACAACAGTTG 
GAAAGCAAAA CCAATACACT CCGTTTATCA 
TCAATAAATA ATATTCATGA AATGGAAATA 



AGCAAAGATC 



CAAACCATAA 
CAAAAAGAAG 
CATCTGGAAG 
ATTGCTAGGG 
CAGTCTCTTT 
GAACAACAGA 
GTGCAGCATC 
GTTGGAATCC 



GTTACAACGA 
CTCAGCTGAG 
TTCACAATTT 
ATGATAGGCA 
GAAAACTTGA 
ACACATCTCT 
TGCAGGCATG 
AATTGCATGT 
TTGAAACAGC 
GAAAACAGAG 
CTGGTGGAAT 
GTCCATGTGG 
CAATACTGTA 



CCTGAATCAG A 



TTTTGAACTG 
AAATCAGCTG 
TAAAACAGAG 



GCTGTGCATT 
ACAGAATGTG 
GTGAGGAAAA 
TTATGTTTCG 
CATTTTTTAT 



GCTAAAGCAG 
TACTTTAGAC 
AATTCTTAAG 
TTCATGAGTT 
AAAAAGTTGC 
GTCCCAAGTG 
AATACTGTTC 
TTTTCTGTTA 
ATCTACCTTT 
TCTCTTGGCA 



AGTGAATTTC 
TTGTATTCAC 
AAGATACAAA 
AAGAGATCCG 
CAAGAAGAAC 
TTTGAAAATG 
GAGCTCCGAA 
K3CCATCACA 
CGCCTCACCA 
CAATATACAG 
AAAGTAGCAA 



AAACAAGGGT 
AAAAACTCGA 
AAGCAAGAAA 
GAGCCATTAG 



GACACTCCAG 



GATAGCTCAG 
GGGGTTTTGA 
CCAAGCACTT 
GATACTATTT 



TCTACTGAGA 
GTTATTGCTA 
AAAATCAAAG 
AGAAAACCTA 
TTTTTTCATA 



TATCCAGCCA 
AATAAGTATT 
ATTTTGAATT 
CATGCTAGTG 
CCTGACATGG 
TACTAACATT 
ATGGGTTAAT 



GGAAACAGCT 
TCAAGAAGAG 
GGTTGAACGA 
IGAAGAAACC 
AGATGTGCAA 
AGAGAATGAT 
ATCTCAGGTC 
AGCTCTGTTG 
CCGT CAACAT 
AAATAACACA 
TCACTTTCCA 
CTGCTGCACT 
CTGAGCAl 33 
TGTTTTGATA 
ATATATTTCA 
AATCATGTAT 
TTCATCATCA 
TTGCACTGTC 



AAGTTGGGGA TTTTCTTGAT CTTTATTGCT GCTTACCATT GAAACTTAAC CCAGCTGTGT 1860

TCCCCAACTC TGTTCTGCGC ACGAAACAGT ATCTGTTTGA GGCATAATCT TAAGTGGCCA 1920 

CACACAATGT TTTCTCTTAT GTTATCTGGC AGTAACTGTA ACTTGAATTA CATTAGCACA 1980

TTCTGCTTAG CTAAAATTGT TAAAATAAAC TTTAATAAAC CCATGTAGCC CTCTCATTTG 2040

ATTGACAGTA TTTTAGTTAT TTTTGGCATT CTTAAAGCTG GGCAATGTAA TGATCAGATC 2100 

TTTGTTTGTC TGAACAGGTA TTTTTATACA TGCTTTTTGT AAACCAAAAA CTTTTAAATT 2160 

TCTTCAGGTT TTCTAACATG CTTACCACTG GGCTACTGTA AATGAGAAAA GAATAAAATT 2220 






MEIQLKDALE KNQQWLVYDQ QREVYVKGLL AKIFELEKKT E7AAHSLPCQ TKKPE3EGYL 
QEEKQKCYND LtASAKKDLE VERQTITQLS FELSEFRRKY EETQKEVHNL NQLLYSQRRA 
DVQHLEDDRH KTEKIQKLRE ENDIARGKLE EEKKRSEELL SQVQSLYTSL LKQQEEQTRV 
ALLEQQMQAC TLDFENEKLD RQHVQHQLHV ILKELRKARK NHTVGILETA S 



Coding sequence 



I 

CCGCCAGATT 
GACGTTGCCC 
CTGGCCCTTC 
CCACTGCCCC 
GGAAGGCTGG 
CGCTTTCCTT 
GGACAGAGAA 
TGAGGAAACT 
CCTCTGGCCG 
GTGCCACCAG 



ACAGTTTTTT 



I 

- TGAATCGCGG 
CCTGCCTGGC 
TTGGAGGGCT 
ACTGAGAACG 
GAGCCAGATG 
TGTGTCAAGA 
AGAGCCAAGA 
GCGAAGAAAG 
GAGCTGCCTG 
CCTTCCTGTG 
GTTTCAACTG 
GCGGGTGCTG 
TTTTGCTGTT 
CCCTTTTGCT 
CTGGACCTCA 
GAATCTGAGC- 
TGTTGTTGTG 



I 



I 



I 



AGCCCTTTCT 
GCGCCTGCAC 
AGCCAGACTT 



CAGAGGTGGC GGCGGCGGCA TGGGTGCCCC 
CAAGGACCAC CGCATCTCTA CATTCAAGAA 
CCCGGAGCGG ATGGCCGAGG CTGGCTTCAT 
GGCCCAGTGT TTCTTCTGCT TCAAGGAGCT 
AGAGGAACAT AAAAAGCATT CGTCCGGTTG 
AGAATTAACC CTTGGTGAAT TTTTGAAACT 
AAAGGAAACC AACAATAAGA AGAAAGAATT 



GTCCCAGAGT 
GGCCCCTTAG 
TGCTCCTGTT 



TTTCCAGGGT TTATTCCCTG 540



TATTTTGTTT 
AGCCATTCTA 
AGTGATAGGA 
AGTGAGCCGC 

TCTGTCAGCC 
CCAGGTCCCC 



GCTGGAAACC 



GAATTGTTAA 
AGTCATTGGG 
AGCGTCTGGC 
GGGGCACATG 
GACTTGGCTG 
CAACCTTCAC 
GCTTTCTTTG 
ATTCGCCCTC 
TCTGGAGGTC 



TTGATTCCCG 
AGAGCTGACA 
TGTTGTTGAG 
TGCAGGTTCC 
TTTTTTTGTT 
GAGTCCCTGG 
TTCACAGAAT 
GAAACGGGGT 



GATGCTGTGG 
ATCTGTCACG 
GAGGCAGCAG 
CTCCCTGTCA 
ATCTCGGCTG 



CAATGTCTTA GGAAAGGAGA TCAACATTTT 
TTGTCTTGAA AGTGGCACCA GAGGTGCTTC 
TGGCTGCTTC TCTCTCTCTC TCTCTTTTTT 
GGCTTACCAG GTGAGAAGTG AGGGAGGAAG
GCTTTGTTCG CGTGGGCAGA GCCTTCCACA 
GCTGTCACAG TCCTGAGTGT G 



CACCTGTGCC TCCTCAGAGG 960 



CTCCTCTACT 
AGCACAAACT 
GAACTTCAGG 
TTTGCCACTG 
CTCCCTCAGA 
GGGACTGGCT 
TTCTCCACAC 



GTTTAACAAC ATGGCTTTCT 
ACAATTAAAA CTAAGCACAA 
TGGATGAGGA GACAGAATAG 
CTGTGTGATT AGACAGGCCC 
AAAAGGCAGT GGCCTAAATC 



GGGGGAGAGA CGCAGTCCGC 
GCTGAAGTCT GGCGTAAGAT 
GGGTGGATTG TTACAGCTTC 
ATAAAAAGCC TGTCATTTC 



KKEFEETAKK VRRAIEQLAA MD 



Seq ID NO: 

Nucleic Aci 
Coding sequence: 127-7 



TTAGGTGGAG 
GAGAGGTACC 
ATTGATTCTG 
AGACCATCCC 



I 

i CTGGTACCCC 
AGGCAGCTCT 
AGAGAGCTTC 
TCACATGTTG 
CCAACACACC 
CATTTGAACC 
CTGGAAGGAT 
TCAACATCGC 



TTCCTCAAGA 
GAGGAAGAGA 
CAGAAAAGGA 



ATGCCAGACA 
TGCTTGATAA 
AGGCCAGTCA 
CIGGTTCATC 
TTTTTTTAAA 
ATTTTGAAAT 



ACGTCTGAAG 
GCAAGATAAA 
TTATGAGAAA 
TCCTCAGATC 
TTGTCTGGAT 
AACTGTGTTG 
GGCTGACATA 
GTGGACAGAG 
TCTACCAGAG 
GCTAGTAGGC 



CGATTTCTCA 
GTTCTCAAAT 
ACCTCTATTC 
TCCTCAGAAT 
AAGCATGCAA 
GCTGGTGACT 
ATAGAAAAGA 
TGTTCTTTGC 
GACATAATTT 
TGAAAAATAA 



ATGACCTGCG
AGCTAGAAGT 
CTCCAATTTA 
TGCCACCAAA 
AGCTGCTCAT 
TTAAATATAA 



CCAGAGTACA 
AATTTCATCC 
CAAGG7GATC 
TTGTGTAGTT 
ATAGTCATTT 



AGCTCAAATA 
TATCATTCCT 
TCATCCAAAC 
AGGTGCTTGG 
GTCAGAACCC 
TAAGCCAGCC 
AAAGGCTGAT 
CAACTCAACA 
TGATGTTTAG 
TAAGTTGCCT 
TATTTATCTT 
AATGTTGAAA 
MQRASRLKRE LHMLATEPPP GITCWQDKDQ MDDLRAQILG GANTPYEKGV FKLEVIIPER
YPFEPPQIRF LTPIYHPNID SAGRICLDVL KLPPKGAWRP SLNIATVLTS IQLLMSEPNP 
DDPLMADISS EFKYNKPAFL KNARQWTEKH ARQKQKADEE EMLDNLPEAG DSRVHNSTQK 
RKASQLVGIE KKFHPDV 



120 

GCTCTCGCCG GCCACACGGA GCGGCGCCGG GGAGCTATGA GCCATGAAGC CGCCCGGCAG 240 

CAGCTCGCGG CAGCCGCCCC TGGCGGGCTG CAGCCTTGCC GGCGCTTCCT GCGGCCCCCA 300 

ACGCGGCCCC GCCGGCTCGG TGCCTGCCAG CGCCCCGGCC CGCACGCCGC CCTGCCGCCT 360 

GCTTCTCGTC CTTCTCCTGC TGCCTCCGCT CGCCGCCTCG TCCCGGCCCC GCGCCTGGGG 420 

GGCTGCTGCG CCCAGCGCTC CGCATTGGAA TGAAACTGCA GAAAAAAATT TGGGAGTCCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCAGTT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAGACTCGGA 600

AAGCCCTTAT CACGTTCTTG ACACAAAGGC AAGACACCAG CAAAAACATA ATAAGGCTGT 660

CCATCTGGCC CAGGCAAGCT TCCAGATTGA AGCCTTCGGC TCCAAATTCA TTCTTGACCT 720 

CATACTGAAC AATGGTTTGT TGTCTTCTGA TTATGTGGAG ATTCACTACG AAAATGGGAA 780 

ACCACAGTAC TCTAAGGGTG GAGAGCACTG TTACTACCAT GGAAGCATCA GAGGCGTCAA 840 

AGACTCCAAG GTGGCTCTGT CAACCTGCAA TGGACTTCAT GGCATGTTTG AAGATGATAC 900

CTTCGTGTAT ATGATAGAGC CACTAGAGCT GGTTCATGAT GAGAAAAGCA CAGGTCGACC 960 

ACATATAATC CAGAAAACCT TGGCAGGACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 
GGAAAGAGGT GACCAGTGGC CCTTTCTCTC TGAATTJ 



TAATGATCAC AAAACGTATA AGAAGCATCG CTCTTCTCAT GCACATACCA ACAACTTTGC 1200 

AAAGTCGGTG GTCAACCTTG TGGATTCTAT TTACAAGGAG CAGCTCAACA CCAGGGTTGT 1260 

CCTGGTGGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACCA CCAACCCTGT 1320 

GCAGATGCTC CATGAGTTCT CAAAATACCG GCAGCGCATT AAGCAGCATG CTGATGCTGT 1380 

GCACCTCATC TCGCGGGTGA CATTTCACTA TAAGAGAAGC AGTCTGAGTT ACTTTGGAGG 1440

TGTCTGTTCT CGCACAAGAG GAGTTGGTGT GAATGAGTAT GGTCTTCCAA TGGCAGTGGC 1500 

ACAAGTATTA TCGCAGAGCC TGGCTCAAAA CCTTGGAATC CAATGGGAAC CTTCTAGCAG 1560 

AAAGCCAAAA TGTGACTGCA CAGAATCCTG GGGTGGCTGC ATCATGGAGG AAACAGGGGT 1620

GTCCCATTCT CGAAAATTTT CAAAGTGCAG CATTTTGGAG TATAGAGACT TTTTACAGAG 1680

AGGAGGTGGA GCCTGCCTTT TCAACAGGCC AACAAAGCTA TTTGAGCCCA CGGAATGTGG 1740

AAATGGATAC GTGGAAGCTG GGGAGGAGTG TGATTGTGGT TTTCATGTGG AATGCTATGG 1800

ATTATGCTGT AAGAAATGTT CCCTCTCCAA CGGGGCTCAC TGCAGCGACG GGCCCTGCTG 1860

TAACAATACC TCATGTCTTT TTCAGCCACG AGGGTATGAA TGCCGGGATG CTGTGAACGA 1920 

r ACTGAATATT GTACTGGAGA CTCTGGTCAG TGCCCACCAA ATCTTCATAA 1980

3CATGCA ATCAAAATCA GGGCCGCTGC TACAATGGCG AGTGCAAGAC 2040 

CAGAGACAAC CAGTGTCAGT ACATCTGGGG AACAAAGGCT GCAGGGTCTG ACAAGTTCTG 2100 

CTATGAAAAG CTGAATACAG AAGGCACTGA GAAGGGAAAC TGCGGGAAGG ATGGAGACCG 2160 

GTGGATTCAG TGCAGCAAAC ATGATGTGTT CTGTGGATTC TTACTCTGTA CCAATCTTAC 2220 



AGGCCGGGTG ATTGACTGCA GTGGTGCCCA TGTAGTTTTA GATGATGATA CGGATGTGGG 2340 

CTATGTAGAA GATGGAACGC CATGTGGCCC GTCTATGATG TGTTTAGATC GGAAGTGCCT 2400 

ACAAATTCAA GCCCTAAATA TGAGCAGCTG TCCACTCGAT TCCAAGGGTA AAGTCTGTTC 2460 

GGGCCATGGG GTGTGTAGTA ATGAAGCCAC CTGCATTTGT GATTTCACCT GGGCAGGGAC 2520 

AGATTGCAGT ATCCGGGATC CAGTTAGGAA CCTTCACCCC CCCAAGGATG AAGGACCCAA 2580

GGGTCCTAGT GCCACCAATC TCATAATAGG CTCCATCGCT GGTGCCATCC TGGTAGCAGC 2640 

TATTGTCCTT GGGGGCACAG GCTGGGGATT TAAAAATGTC AAGAAGAGAA GGTTCGATCC 2700 

TACTCAGCAA GGCCCCATCT GAATCAGCTG CGCTGGATGG ACACCGCCTT GCACTGTTGG 2760 

ATTCTGGGTA TGACATACTC GCAGCAGTGT TACTGGAACT ATTAAGTTTG TAAACAAAAC 2820 

CTTTGGGTGG TAATGACTAC GGAGCTAAAG TTGGGGTGAC AAGGATGGGG TAAAAGAAAA 2880

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACCTGTCAGT AAACGGGGGA 2940 

GGGGGCAAAA GACCATGCTA TAAAAAGAAC TGTTCCAGAA TCTTTTTTTT TCCCTAATGG 3000 
ACGAAGGAAC AACACACACA CAAAAATTAA ATGCAATAAA GGAATCATTA AAAA 



MKPPGSSSRQ PPLAGCSLAG ASCGPQRGPA GSVPASAPAR TPPCRLLLVL LLLPPLAASS 
RPRAWGAAAP SAPHWNETAE KNLGVLADED NTLQQNSSSN ISYSNAMQKE ITLPSRLIYY 
INQDSESPYH VLDTKARHQQ KHNKAVHLAQ ASFQIEAFGS KFILDLILNN GLLSSDYVEI 
HYENGKPQYS KGGEHCYYHG SIRGVKDSKV ALSTCNGLHG MFEDDTFVYM IEPLELVHDE
KSTGRPHIIQ KTLAGQYSKQ MKNLTMERGD QWPFLSELQW LKRRKRAVNP SRGIFEEMKY 
LELMIVNDHK TYKKHRSSHA HTNNFAKSVV NLVDSIYKEQ LNTRVVLVAV ETWTEKDQID
ITTNPVQMLH EFSKYRQRIK QHADAVHLIS RVTFHYKRSS LSYFGGVCSR TRGVGVNEYG

LPMAVAQVLS QSLAQNLGIQ WEPSSRKPKC DCTESWGGCI MEETGVSHSR KFSKCSILEY 
RDFLQRGGGA CLFNRPTKLF EPTECGNGYV EAGEECDCGF HVECYGLCCK KCSLSNGAHC
SDGPCCNNTS CLFQPRGYEC RDAVNECDIT EYCTGDSGQC PPNLHKQDGY ACNQNQGRCY
NGECKTRDNQ CQYIWGTKAA GSDKFCYEKL NTEGTEKGNC GKDGDRWIQC SKHDVFCGFL
LCTNLTRAPR IGQLQGEIIP TSFYHQGRVI DCSGAHVVLD DDTDVGYVED GTPCGPSMMC 720

LDRKCLQIQA LNMSSCPLDS KGKVCSGHGV CSNEATCICD FTWAGTDCSI RDPVRNLHPP 780
KDEGPKGPSA TNLIIGSIAG AILVAAIVLG GTGWGFKNVK KRRFDPTQQG PI

Seq ID NO: 151 DNA sequence

Nucleic Acid Accession #: NM_023915
Coding sequence: 250-1326

1 11 21 31 41 51

GGCACGAGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCTTAGTT 60 

TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACCGTATGAG 120 

GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180 

CCCACGCCTC AATCGTCCCC AAGTGTTTCC TGACACGCAT CTTTGCTTAC AGTGCATCAC 240 

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCAAAATTAC CAAATAACGA GCTGCACGGC 300

CAAGAGAGTC ACAATTCAGG CAACAGGAGC GACGGGCCAG GAAAGAACAC CACCCTTCAC 360 

AATGAATTTG ACACAATTGT CTTGCCGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420 

TTGCTGAATG GTTTAGCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTCGA 540 

ATAGTCCATG ATGCAGGATT TGGACCTTGG TACTTCAAGT TTATTCTCTG CAGATACACT 600

TCAGTTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT GATAAGCATT 660 

GATCGCTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720

ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780 

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 

CCTTTGGGGG TCAAATGGCA TACGGCAGTC ACCTATGTGA ACAGCTGCTT GTTTGTGGCC 900

GTGCTGGTGA TTCTGATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCCAGC 960 

AGGCAATTCA TAAGTCAGTC AAGCCGAAAG CGAAAACATA ACCAGAGCAT CAGGGTTGTT 1020 

GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTGT GCAGAATTCC TTTTACTTTT 1080

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140 

ATTACACTTT TCTTGTCTGC GTGTAATGTT TGCCTGGATC CAATAATTTA CTTTTTCATG 1200

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAG GAGTGAAAGC 1260

ATCAGATCAC TGCAAAGTGT GAGRAGATCG GAAGTTCGCA TATATTATGA TTACACTGAT 1320 

GTGTAGGCCT TTTATTGTTT GTTGGAATCG ATATGTACAA AGTGTAAATA AATGTTTCTT 1380 
TTCATTATCC TTAAAAAAAA AA

Seq ID NO: 152 Protein sequence: 
Protein Accession #: NP_076404 

1 11 21 31 41 51

I I I I I I 

MGFNLTLAKL PNNELHGQES HNSGNRSDGP GKNTTLHNEF DTIVLPVLYL IIFVASILLN 60

GLAVWIFFHI RNKTSFIFYL KNIVVADLIM TLTFPFRIVH DAGFGPWYFK FILCRYTSVL 120

FYANMYTSIV FLGLISIDRY LKVVKPFGDS RMYSITFTKV LSVCVWVIMA VLSLPNIILT 180

NGQPTEDNIH DCSKLKSPLG VKWHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 240

ISQSSRKRKH NQSIRVVVAV FFTCFLPYHL CRIPFTFSHL DRLLDESAQK ILYYCKEITL 300
FLSACNVCLD PIIYFFMCRS FSRRLFKKSN IRTRSESIRS LQSVRRSEVR IYYDYTDV 

Seq ID NO: 153 DNA sequence 
Nucleic Acid Accession #: D80008.1
Coding sequence: 149-739 

1 11 21 31 41 51 

I I I I I I

GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60

CGAAAGGAGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 120

AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCG 180

CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACGAGGATG GACTCAGACA 240

AGTTCTGGAG GAGATGAAAG CTTTGTATGA ACAAAACCAG TCTGATGTGA ATGAAGCAAA 300 

GTCAGGTGGA CGAAGTGATT TGATACCAAC TATCAAATTT CGACACTGTT CTCTGTTAAG 360

AAATCGACGC TGCACTGTAG CATACCTGTA TGACCGCTTG CTTCGGATCA GAGCACTCAG 420

ATGGGAATAT GGTAGCGTCT TGCCAAATGC ATTACGATTT CACATGGCTG CTGAAGAAAT 480 

GGAGTGGTTT AATAATTATA AAAGATCTCT TGCTACTTAT ATGAGGTCAC TGGGAGGAGA 540

TGAAGGTTTG GACATTACAC AGGATATGAA ACCACCAAAA AGCCTATATA TTGAAGTCCG 600 

GTGTCTAAAA GACTATGGAG AATTTGAAGT TGATGATGGC ACTTCAGTCC TATTAAAAAA 660

AAATAGCCAG CACTTTTTAC CTCGATGGAA ATGTGAGCAG CTGATCAGAC AAGGAGTCCT 720

GGAGCACATC CTGTCATGAC CATGCGCCGA GGCACTTCCA GGCTTCACTC AACTCATGGA 780 

CTCCTCTGTA CTCACTCTCT CCACCACTCC CTTCACCTCC CTCTTTGATT TTAGAAGCTA 840 

TAGACATTGT TTAAGATAAC TAAGAATACT TGGCTAAGAA GTATAATTTG CTAACTATTA 900 

AGGACTTTCT TTTTTTAATG TTGTACACTA TTCTTCCTAC TCTTTTTTGG TTTTGGTTTT 960

GTTTTGTAGA GACTGTCTCA CTATGTTGCC CAAGCTGGTC TCAAACTCCT GGCCTCAAGC 1020

AGTCCTCCCA CCTTAGCTTC TCAAAGTGTT GAGATCACAG GCGTGAGCCA CTGCACCCGG 1080 

CCCCTACTCC TTTTTCTAAT AAGCTGTATC TGTAATCACA GCATTCCTAC AGTTGTTACA 1140

GTGTGTTTTT TAAATGAAAG TAAACATGGT TACATTTGAA TCTCTTAAAT AAGCAGTCAC 1200

TTGGCTGGAC AGGAAGAAGG TAGATCCTGT GTGTCTTGTT TTCTGGTCAT GTGTATTGTA 1260

CAAGCTAGAG AGCTGAATTT CTGAGATACA CATTTTCAAA TCACATGCAA GTGAAGATGA 1320

TGGTCTGTAG AAATTTTCAG TATATATAAT GTTTAATGAC ATACTAATTT ATCATCTGGC 1380 

TATTTGGGAA GGAAGGACAC ACATGGATTT TGCACATTTC CACCATGGTG GCTGGTGTGG 1440

CTTGTGGCTA TGGGGTGATC ACCAGTATCA CCACTTTGGA AGGGGACAGT GAAATTGGGG 1500 

CTAGAGAAGG AACTTTGTAC AGTTTTCCCT GAGATTCAGA TTGACTGAAA AGTCACATGA 1560

AGAGTTGATT GTCTTTTAAT GGTATGTTTT AAACAGCTGA CATTTTAAAT TTTGATGAAA 1620 

TCCAGTTTAT TCGTTTGTTC TTTTATGCTT TGGGTGTTGC ATCCGAGAAA TCTTTTCCCA 1680

TCCCAAGATC ACAATTTTTT TTCCTTTTTA CTTCTAGAAG TGTTATAATT TTAAGCTTTA 1740 

TACTTTGGTC TATGACCCGT TTTTTTTTTT GTTTTGTTTT GTTTTTTCGT TTGTTTCTTT 1800 

GTTTTGAGAT GGAGTCTTGT TCTGTCACCC AGGCTGGGGT GCAGTGGCGT GATCTTGGCT 1860

CACTGCAATC TCTATCCCCT GGGTTCAAGT GATTCTCTTG TCTCAGCCTC CCAAGTAGCT 1920

GGGATTACAG GCACAGGCCG CCACGCCTGG CTAATTTTTG TATTTTTAGT AGAGACAGAG 1980
TTTTACCATG TTGGCCAGGC TGGTTTCAAA CTCCTGACCT CAAGTGACCC ACCTTGGCCT 2040

CCCAAAGTTT TGGGATTACA AGTGTGGGCC ACCGCGGCCA GCCTATGATC CATTTTGAAT 2100 

GAATTTTTTA TATGGTGCAA GGTGTCAATC CACCTTCACT TTTTCTTGGG AATATAGATA 2160

TCCAGCTGTT TCACTACCAT TTTTTGAAAG GACTGCCCTT TGCTCTATCA CCTTTGCATT 2220 

TTTGTTAAAA AGTAGTTGTC AATGTATATG TGGGTTTATT TCAGGACTCT GTTTTGTTCC 2280

ATTGACCTGT TTTTCTCTCC TGAATGCCAA TACCATATTT GTATGTAGTG TATGTAATTT 2340 

TCTAATAATT CTTGAAACAG ATAGTATTAA TGTGTCATAT TTTTGCTGTT GTTTGTATTT 2400 

TTTGTAGAGA TGGGGTTTCA CCGTGTTGGC CAGGCTGTGT TGAACTCCTG AGCTAAAGCA 2460

ATACACTTGC CTCGTCCTCC CCATGTGCTG GGATTACAGG CGTGAGCCTT GGTGCTGGCC 2520 

CAGTGTACCA CATTTCTTTT TGAGATTTGT TTTGGCTATG TTAAGTCCTT TGCTTTTGAT 2580 

GTGAAATTTG GGAACAGGCA GGGTGTGGTG GCTTATGCCT GTAATCCTAG AACTTTGGGA 2640 

GGCCTAGATG GGTGGATCAC TTGAGCTCAG GAGTTCCAGA CCAGCCCGGG CCTATGGCAA 2700

AACTCCGTCT CTACAAAAAA TAGAAAAAAT TAGCCAGGTG TGGTGGTGCA TGCCTGTAGT 2760

CACAGTTACA CGGCAGGCTG AGGTGGGAGG ATCACTTGAA CCCCAGAGGT CAAGACTGCA 2820

GTGAGCTGAG ATCACACCAC TGTACTCCAG CCTGGGTGAC AAAGTGAGAC TCTATCTCAA 2880 

AAAGAAATTA GGATCAATTT GTCAATTTCT ACAACAACAA CAACAAAAAC CCCTGTTGGG 2940 

CACCTTGATT GAGATTGCAT TGAATTTATA TAAAACTGTT GGGAGAATTG ACATCTTAAT 3000

AATATTGAGT CTTCTGGCCT ATAAACAAGG TCTGTCTTCC TAGGTATTAA TGTTTTGTCT 3060

TCTATTTCTC TTAATAATCT TTTGTAGTTT TCAGTGTACA GGTCTACCAT GTCAGCATTT 3120 

CATAGTTTTG ATGCTAAATG GTATTTTAAA ATTTCAAATT CTAACCACTT GTTGCTAGTA 3180 

AATAGAAATA CAATTGATGT TGAACTTGTA TCCTTCAGCC TTGCTAAACT GTGAGTTCTC 3240 

ATGGTGTTTT TGTAAATTAC ATCAACAGTC ATGTGTTCTA TGAATAAAGA GTTTTACTCC 3300 



MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP
TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE MEWFNNYKRS
LATYMRSLGG DEGLDITQDM KPPKSLYIEV RCLKDYGEFE VDDGTSVLLK KNSQHFLPRW
KCEQLIRQGV LEHILS



GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 
CGAAAGGAGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 
AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCG



300 



GTCAGGTGGA CGAAGTGATT TGATACCAAC TATCAAATTT CGACACTGTT CTCTGTTAAG 

AAATCGACGC TGCACTGTAG CATACCTGTA TGACCGCTTG CTTCGGATCA GAGCACTCAG 420 

ATGGGAATAT GGTAGCGTCT TGCCAAATGC ATTACGATTT CACATGGCTG CTGAAGAAAT 480

GGAGTGGTTT AATAATTATA AAAGATCTCT TGCTACTTAT ATGAGGTCAC TGGGAGGAGA 540 

TGAAGGTTTG GACATTACAC AGGATATGAA ACCACCAAAA AGCCTATATA TTGAAGCTGG 600

ATGCAGTGGC GCGATCTCGG CTCAACCTGC AACCTCCACC TCCCAGGTTC ACCTCAACTG 660 

CAACCTCCAC CTCCCAGGTC CGGTGTCTAA AAGACTATGG AGAATTTGAA GTTGATGATG 720 

GCACTTCAGT CCTATTAAAA AAAAATAGCC AGCACTTTTT ACCTCGATGG AAATGTGAGC 780

AGCTGATCAG ACAAGGAGTC CTGGAGCACA TCCTGTCATG ACCATGCGCC GAGGCACTTC 840 

CAGGCTTCAC TCAACTCATG GACTCCTCTG TACTCACTCT CTCCACCACT CCCTTCACCT 900 

CCCTCTTTGA TTTTAGAAGC TATAGACATT GTTTAAGATA ACTAAGAATA CTTGGCTAAG 960 

AAGTATAATT TGCTAACTAT TAAGGACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCCT 1020 

ACTCTTTTTT GGTTTTGGTT TTGTTTTGTA GAGACTGTCT CACTATGTTG CCCAAGCTGG 1080

TCTCAAACTC CTGGCCTCAA GCAGTCCTCC CACCTTAGCT TCTCAAAGTG TTGAGATCAC 1140

AGGCGTGAGC CACTGCACCC GGCCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA 1200 

CAGCATTCCT ACAGTTGTTA CAGTGTGTTT TTTAAATGAA AGTAAACATG GTTACATTTG 1260 

AATCTCTTAA ATAAGCAGTC ACTTGGCTGG ACAGGAAGAA GGTAGATCCT GTGTGTCTTG 1320 

TTTTCTGGTC ATGTGTATTG TACAAGCTAG AGAGCTGAAT TTCTGAGATA CACATTTTCA 1380

AATCACATGC AAGTGAAGAT GATGGTCTGT AGAAATTTTC AGTATATATA ATGTTTAATG 1440

ACATACTAAT TTATCATCTG GCTATTTGGG AAGGAAGGAC ACACATGGAT TTTGCACATT 1500 

TCCACCATGG TGGCTGGTGT GGCTTGTGGC TATGGGGTGA TCACCAGTAT CACCACTTTG 1560 

GAAGGGGACA GTGAAATTGG GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA 1620 

GATTGACTGA AAAGTCACAT GAAGAGTTGA TTGTCTTTTA ATGGTATGTT TTAAACAGCT 1680 

GACATTTTAA ATTTTGATGA AATCCAGTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT 1740 

GCATCCGAGA AATCTTTTCC CATCCCAAGA TCACAATTTT TTTTCCTTTT TACTTCTAGA 1800 

AGTGTTATAA TTTTAAGCTT TATACTTTGG TCTATGACCC GTTTTTTTTT TTGTTTTGTT 1860 

TTGTTTTTTC GTTTGTTTCT TTGTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGG 1920 

GTGCAGTGGC GTGATCTTGG CTCACTGCAA TCTCTATCCC CTGGGTTCAA GTGATTCTCT 1980 

TGTCTCAGCC TCCCAAGTAG CTGGGATTAC AGGCACAGGC CGCCACGCCT GGCTAATTTT 2040

TGTATTTTTA GTAGAGACAG AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC 2100 

CTCAAGTGAC CCACCTTGGC CTCCCAAAGT TTTGGGATTA CAAGTGTGGG CCACCGCGGC 2160 

CAGCCTATGA TCCATTTTGA ATGAATTTTT TATATGGTGC AAGGTGTCAA TCCACCTTCA 2220 

CTTTTTCTTG GGAATATAGA TATCCAGCTG TTTCACTACC ATTTTTTGAA AGGACTGCCC 2280 

TTTGCTCTAT CACCTTTGCA TTTTTGTTAA AAAGTAGTTG TCAATGTATA TGTGGGTTTA 2340 

TTTCAGGACT CTGTTTTGTT CCATTGACCT GTTTTTCTCT CCTGAATGCC AATACCATAT 2400 

TTGTATGTAG TGTATGTAAT TTTCTAATAA TTCTTGAAAC AGATAGTATT AATGTGTCAT 2460 

ATTTTTGCTG TTGTTTGTAT TTTTTGTAGA GATGGGGTTT CACCGTGTTG GCCAGGCTGT 2520 

GTTGAACTCC TGAGCTAAAG CAATACACTT GCCTCGTCCT CCCCATGTGC TGGGATTACA 2580 

GGCGTGAGCC TTGGTGCTGG CCCAGTGTAC CACATTTCTT TTTGAGATTT GTTTTGGCTA 2640

TGTTAAGTCC TTTGCTTTTG ATGTGAAATT TGGGAACAGG CAGGGTGTGG TGGCTTATGC 2700 

CTGTAATCCT AGAACTTTGG GAGGCCTAGA TGGGTGGATC ACTTGAGCTC AGGAGTTCCA 2760 

GACCAGCCCG GGCCTATGGC AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG 2820



TGTGGTGGTG CATGCCTGTA GTCACAGTTA CACGGCAGGC TGAGGTGGGA GGATCACTTG 2880
AACCCCAGAG GTCAAGACTG CAGTGAGCTG AGATCACACC ACTGTACTCC AGCCTGGGTG 2940
ACAAAGTGAG ACTCTATCTC AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC 3000
AACAACAAAA ACCCCTGTTG GGCACCTTGA TTGAGATTGC ATTGAATTTA TATAAAACTG 3060
TTGGGAGAAT TGACATCTTA ATAATATTGA GTCTTCTGGC CTATAAACAA GGTCTGTCTT 3120
CCTAGGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCAGTGTA 3180
CAGGTCTACC ATGTCAGCAT TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA 3240
TTCTAACCAC TTGTTGCTAG TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAG 3300
CCTTGCTAAA CTGTGAGTTC TCATGGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC 3360
TATGAATAAA GAGTTTTACT CCTTC
MFCEKAMELI RELHRAPEGQ L
TIKFRHCSLL RNRRCTVAYL Y 
LATYMRSLGG DEGLDITQDM KPPKSLYIEA GCSGAISAQP ATSTSQVHLN CNLHLPGPVS 



AAGCGCGGAG 
AGGCGCCGAG 
AGTGGGAAGC 



AATCGACGCT 
TGGGAATATG 
CGGTGTCTAA 
AAAAATAGCC 
CTGGAGCACA 
GACTCCTCTG 



GTAGCGTCTT 
AAGACTATGG 
AGCACTTTTT 
TCCTGTCATG 



I 

■ CGGAGGCCGA 
AGCCCAGATA 
GTCCGCCATG 
AGGGCAACTG 
TTTGTATGAA 
GATACCAACT 
ATACCTGTAT 



I 

■ GGCGAGAGCC 
CCATTTTGGC 
TTCTGCGAAA 
CCTGCCTTCA 
CAAAACCAGT 
ATCAAATTTC 
GACCGCTTGC 
TTACGATTTC 
GTTGATGATG 
AAATGTGAGC 



GTGAGAGCTG 
AAGCCATGGA 
ACGAGGATGG 



GGACTAGAAC • 
ACTGATCCGC 



TAAGGACTTT 



GTTTAAGATA 



AC ' 
CCCCCCTACT 
CAGTGTGTTT 
ACTTGGCTGG 
TACAAGCTAG 



GAGACTGTCT 
CACCTTAGCT 
CCTTTTTCTA 
TTTAAATGAA 



CTCCACCACT 
ACTAAGAATA 
TGTTGTACAC 



GACACTGTTC 
TTCGGATCAG 
ACATGGCTGC 
GCACTTCAGT 
AGCTGATCAG 
CAGGCTTCAC 



TGAAGAAGTC 
ACAAGGAGTC 



GCTATTTGGG 
GGCTTGTGGC 
GGCTAGAGAA 
GAAGAGTTGA 
AATCCAGTTT 



AGAGCTGAAT 
AGAAATTTTC 
AAGGAAGGAC 
TATGGGGTGA 
GGAACTTTGT 

TCACAATTTT 
TCTATGACCC 



TCTCAAAGTG 
ATAAGCTGTA 
AGTAAACATG 
GGTAGATCCT 
TTCTGAGATA 
AGTATATATA 
ACACATGGAT 
TCACCAGTAT 
ACAGTTTTCC 
ATGGTATGTT 
TCTTTTATGC 
TTTTCCTTTT 



CCCAAC 
TTGAGATCAC 
TCTGTAATCA 
GTTACATT7G 
GTGTGTCTTG 
CACATTTTCA 
ATGTTTAATG 
TTTGCACA7T 
CACCACTTTG 
CTGAGATTCA 
TTAAACAGCT 
TTTGGGTGTT 
TACTTCTAGA 



r CCCTCTTTGA T' 
3 AAGTATAATT 
ACTGTTTTTT 
TCTCAAACTC 
AGGCGTGAGC 
CAGCATTCCT 
AATCTCTTAA 
TTTTCTGGTC 
AATCACATGC 
ACATAC7AAT 
TCCACCATGG 



TGCTAACTAT 7 60 



GACATTTTAA 
AGTGTTATAA 



CTGGCCTCAA 
CACTGCACCC 
ACAGTTGTTA 
ATAAGCAGTC 
ATGTGTATTG 
AAGTGAAGAT 
TTATCATCTG 
TGGCTGGTGT 
GTGAAATTGG 
AAAGTCACAT 
ATTTTGATGA 
AATCTTTTCC 



CTGGGATTAC 



TCTCTATCCC 
AGGCACAGGC 
TGTTGGCCAG 



GTTCTGTCAC 
CTGGGTTCAA 
CGCCACGCCT 



I TATATGGTGC A 



CAGCCTATGA 
CTTTTTCTTG 
TTTGCTCTAT 



GTTTGTTTCT 
GTGATCTTGG 
TCCCAAGTAG 
GTAGAGACAG 
CCACCTTGGC 
TCCATTTTGA 
GGAATATAGA : 
CACCTTTGCA : 



CCATTGACCT GTTTTTCTCT CCTGAATGCC 



ATTTTTGCTG T 



CAATACACTT G 



ATGTGAAATT 
GAGGCCTAGA 
AAAACTCCGT 
GTCACAGTTA 
CAGTGAGCTG 



TGGGAACAGG 



GGCACCTTGA 
ATAATATTGA 
CTTCTATTTC 
TTCATAGTTT 
TAAATAGAAA 
TCATGGTGTT 



CTCTACAAAA 
CACGGCAGGC 
AGATCACACC 
TAGGATCAAT 
TTGAGATTGC 



ACTTGAGCTC 
AATAGAAAAA 
TGAGGTGGGA 
ACTGTACTCC 
TTGTCAATTT 



ATTAGCCAGG 



TGTTAAGTCC 

CTGTAATCCT 
GACCAGCCCG 



TCTTAATAAT 
TGATGCTAAA 
TACAATTGAT 
TTTGTAAATT 



CTATAAACAA 
CTTTTGTAGT 
TGGTATTTTA 



AGCCTGGGTG 
CTACAACAAC 
TATAAAACTG 



AACCCCAGAG 
ACAAAGTGAG 
AACAACAAAA 



ACATCAACAG 



TTTCAGTGTA 
AAATTTCAAA 
TATCCTTCAG 
TCATGTGTTC 



CCTAGGTATT 
CAGGTCTACC 
TTCTAACCAC 
CCTTGCTAAA 
TATGAATAAA 



ACTCTATCTC 
ACCCCTGTTG 
TGACATCTTA 
AATGTTTTGT 
ATGTCAGCAT 
TTGTTGCTAG 
CTGTGAGTTC 
GAGTTTTACT 



2280 

2400 

2520 
2580 
2640

2760
2820 
2880 
2940 
3000 
3060 
3120
I



I 



IGGCGCTGT 



T GGTGGTTGGC 



GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGC 
CGAAAGGAGT GAGGCGCCGA G. 

AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG A 
CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACAATTAGC TGGGTGTGGT 
GGCACACACC TGTAGTCCCA GCAACTTAGG AGGCTGAAGT GAGAGGATTG CATGGCTCCA 
GGAAGTTGAA ACTGCAGTGA ACTGTGGTCA CGCTATTACA CTCCAGCCTG GGTGACAGAC 
3 TCTCAAAAAG GAAAAGGAGG ATGGACTCAG ACAAGTTCTG GAGGAGATGA 
iCAAAAC CAGTCTGATG TGTTCTCTGT TAAGAAATCG ACGCTGCACT 
GTAGCATACC TGTATGACCG CTTGCTTCGG ATCAGAGCAC TCAGATGG 

Seq I 



Seq ID NO: 161 DNA sequence
Nucleic Acid Accession #: U10694
Coding sequence: 1333-2280 



GGATCCGGCC G 



AGGAACCTAA 
TTTGCCCTGC 

CACGTCAGCA 
CCCACTCACC 



GAGGTGAGGA CTTTGTTCTC 
ACAGACACAG TGGTCCCAGG 
AGGGTACCTC 
ACCCCAGAGA 

A GGGCTGGCCT 1 



CTCACCCCAG 
AACCCTGGGC 
ATGTGAGCTC 



AAACACAGAG 
GATGGACTCC 
AGGGAGGGTT 



CCCAGGCCCT 
GACCTAGCCC CACCCTGCCC 
CCTCACTTCC TCTTCAGGTG 
CCAGACCCTG CAGGCATCAA 



CAGGTGCAGA CM 



7CTCCTGGAG 
GATGAGGACC 
TCTTACTGTA 
CCTTCTGTTC 
GAGTAGAGTC 



51 
I 

" TGTGGACAAA 
AGTCCAGGTG 
ATCAAGAGAG 
TGAGGTCCCT 
GGGCTGCACT 
CTGAGGGGAC 
CTGAGGGAAG 
ATAGGGCCTC 
AGGCAGTATC 
CTTCCGAGGA 
CATATCAGGG 



AGAACTCAAG 



AGGCTAGCTG 
ACCAGGAGGA 
GCCTTTGTTA 
TCCCCAGGCC 
ACCAGAGTCA 
GAAGCCCAAG 
GAGACTACCT 
CCTCCCCAGA 
AGCCAATTCG 
CCAGCTCAGC 
CATTTCCTGC 
AGCGTCATCA 



AGTGTCCAGC 
CCTAAGGGCC 
GACAGTGTCC 
GTGTTCACCC 
GCACCTGGCC 
CACGCTGAGT 



GAACCTCCAA 
TGTGGGTCTC 
TCATGTCTCT 
GAGAGGACTT 



CTTGTCACTG C 



CACGCTGAAA 



GTCCTCAGGG 
ATGAGGGCTC 
TGGAGTTCAT 
TCCACAAATA 
AAAATTACAA 
TCTTTGGCAC 

TCCTGATCAT 
TCTGGGAAGC 
AGCCCAGGAA 
TGCCCGGCAG 




ACCCATCCCT 
CAGCCGGGGC 
ACATGAGGCC 



CAGCAAGGAG 
AGGCGCTTCC 
CAGCAGTCAA 
GTTCCAAGAA 
TCGAGTCAAG 
GCGCTACTTT 
TGATGTGAAG 
CTCGTGCGAT 
TGTCCTGGGT 
GTTGAGTGTG 
GCTGCTCACC 
TGATCCTGCG 
GAAGGTCATA 



GCACTGAAAT 
GAGCCGGTCA 
CCTGTGATCT 



CAAAGGCAGA AATGCTGGAG 
TCGGCAAAGC CTCCGAGTTC 
CTCCTACATC 



GTGATTTGGA G 



CAAAGTTTGT 
CATTCTTCGC 
GAGCACACTG 
TGCTCCCTTT 



GTGATGGTCA TAGCATGCCC 
CCAAAGACAA CTGCGCCCCM 
ATGTTGGGAA GGAGCACATG 
TGCAGGAAAA CTACCTGGAG 
TCCTGTGGGG TTCCAAGGCC 
TCATGCTCAA TGCAAGAGAG 
AGGAGCAAGA GGGAGTCTGA 



CAAGATAACA 
GAGCTCATAA 
CTCTCCTGTA 
TGTAAGAGAA 
AGACACGCAC 



TATGTCATCT 
TGGAATTGTT 
GAATGACAGT 
GAGTCACATG 



AGAAATAGTG AAATGAAAAT 
AAATTAAAAC ATATACATGT 
ATAAAAATTG AAAGAATAAT 
TGAACATCTG TTATTCGGAA 



GTAGTTAATT CTTGCCTTAT 
ATACCTGGAT TTGCITGGCT 
TTTTCCTGTT CACTGGCTCA 
CACCCTGGGT T 



2160 
2220 
2280 
2340 

2460 

2SS0 



2760 
2820 
2880
MSLEQRSPHC KPDEDLEAQG EDLGLMGAQE PTGEEEETTS S

PQGGASSSIS VYYTLWSQPD EGSSSQEEEE PSSSVDPAQL EFMFQEALKL K 

HKYRVKEPVT KAEMLESVIK NYKRYFPVIF GKASEFMQVI F 

LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEEVI WEALSVMGVY VGKEHMFYGE 



PRKLLTQDWV QENYLEYRQV PGSDPAHYEF LHGSKAHAET 



MLNAREPICY 300



I 



AAGTTCGTCA 
CTCAAAATGA 
ATATACGAGA 
GATACAAAGA 
ATGACCTGGC 
CAGCAACACA 



AAGGATTAGA 



GAGTGTAAAA 
CCCTCAAGAT 
ACTCGATCCA 
GTGCGAGTTC 
AGAACAACTA 

TCTAGAAATG 
CTTATGGAAT 



GGATCATTGG 
GAGAATAAAA 
GCCAGCATTA 
TCCAAACAGG 
AAGGCCCAGA 



GCCATTGCCT 



ATGAAGAAGG 



CTGTACAATA 
AGATCAATCC 
CCATATCGTA 



AGCATGGCCT 
AAAAAGTACA 
AATACAACAG 
TCACAATTCA 
TTTCATTCTT 
TAATTCCAAC 
AAGTCTTAGA 
CTGTCTTCAT 
TATAAAAACT 
CCAGTGAGAG 



GTTGTTATTG 
ACAGTGTAGC 
AAAATTGCAC 
GACTGAGGGT 
CTAATGGTGG 



ACAGGAAGAA G 
TTGGAATAGA Tl 
GTGTGTTGAT T. 
AGTTCATGGA Tl 
TACCCAAGAT O 
CTTTTAATTT 
ACTGGAACTT 
TGGAACATCA 
CGATGATTGC 



GAAATCATCG 
AACAGCAGTA 
TTTCCAAAAT 



I 



TGCAAAGAAT 
AGTGCTTAAT 
TAAACGATCA 
AGATGACATG 



AGTTGTCTTT 
CCTGAACTTT 
CTGTACAATA 
CAGTTCTGTG 
AAGTTCAGAG 
GAATTAGGAT 
TTGAAAGAAC 



GGAAGATTTA 
ATACCAAAAG 
TCTAATTATG 
CGCCCTCAAA 



C TGCTGGCTGG ACTGAACTGA 900 



AGCTTTATGG 
CAGAAAACAG 
TCTTAAATAT 
TGATATATTT 



ACTAAAGGAA CCTTTTAGAA TGTACATAGT 840 
AGTCAATTTC T. 
TGAGACAAAA C 
TTTGGGCTTG T 
TCATCTTATA T 



CTTTAAGGAT 
TCTTCTAGTC 
TCTGTGGACC 
GAGGTTTTCA 
AATTAAACAT 
CTAGTTGCAA 
GATCTGGAAT 



TGGGCCGCTC 
ATCCTAGTTT 
TTCTCGCATA 
ATGAACATCA 
AGACTATGCT 



CTGAAAAATG 
CTGCATAATG 



AATATAGGTA 



AATGTCAGAA 
TATTGTGTTT 
CCCATGTTTA 
TTAGATTCTT 
ATGCTTATCT 
CTACACAATA 

GAGTTAAATT 
CTGCTCTTTC 
ACTGCTAATC 
TACACGATGC 
TAGCTCAGTT 
GTGTGTGTGT 
TTTTTTTTAA 




TTTGAAAATT 
ATCTTGCTGC 
GAAAAATTTT 
TGTCTTGAAG 



TTTTAAATGC 
TATAATTACT 
TAAAAGTTAA 
CTGAAAAGAG 
AGGGAATATT 
CTGTGAGAGA 
CCCCTTGAAT 



TTAGAATAGC 
AAAACTGCTT 
TATTCCAATC 
TCTATTCAGG 
TTTCAGTATT 
TAAAAGATAT 
ATTTTGAGTT 
AAAGTCATAT 



TTCTCATCAA 
GTATGTGTGT 
ATTACAAAAG 
ACATGTGAAG 
\ ATACTTCAAA 
\ TATCCAGGCA 



ATATATATAT 
CCATGAGCTG 
AAGGGTTTCT 



AGTACATTTT 
TTTACATGAT 
ATAACTTTAA 



CTCTCTAAGG 
ATGCAAATAT 
TTTGAAATAG 



ATATATATAT 
CTTTTATGCT 
TGCTTTCTTA 
TTATGGTCTT 
TCTAGGGTTG 



ACTAGAAAGC 



GAAAATGGTC 
AACATTTCCG 
TTGATAGGAA 
ATATTAACAA 
TTAAGAAACA 



AAAAGAAAAA 



CAACAAAATA 
TATGATGGCA 



CACATATAAA 
GCAGTGAAAT 



AAAACCTTTT 
GTGTAATTAT 
GGTTGTTCAC 
AAGACTGATT 

AAATGATTAC 
GTGAGAGAGA 
TGCTTTAACT CCTAAGTGTT 
TTTTAAACTG AACCATAGGT 
TTAAAATGTA 
AAATACTTGA 
TTGTGTTTCT 



GATGAAGATT 



GTTTCAGATT 
CCTAATGCCT 
GTGTTTGTCA 
ACATAAAGCT 
TCTTAGAATT 
TGATTATTCT 
ACATATGGCT 
TTGACAGAGA 
TCAGTCCTCA 
ACAGTTTCCT 
AATAGAAAAA 
ATCTTTAGTG 



A CATATGCTGT 



AAACAAAACT 
GAAGATAAAT 
GTGTGTGTTT 



GTCTGGCTAC 
GAGTCTGTGT 
CACACACACG 
CTCCCCATGC 
TGTCAAAACA 
TAGAAGTGCC 
GCATATCTCT 
AATAAAAGCT 



3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 



MNKLKSSQKD KVRQFMIFTQ SSEKTAVSCL SQNDWKLDVA TDNFFQNPEL Y 
DRKKLEQLYN RYKDPQDENK IGIDGIQQFC DDLALDPASI SVLIIAWKFR AATQCEFSKQ 
EFMDGMTELG CDSIEQLKAQ IPKMEQELKF 



5 TFNFAKNFGQ K>
3 CATGCTCAGT AGCTGCTGCC GWLt 2GCTGCGC 60

GCCTACGGGC TGCGGTGGCG GCCGCCGCGG CACCCGGCAG GGCCCGCCAG TCCCCGCTTC 120 

CCTGCTCCAG AGCCGCCGCC TGGGCCGGGG CAGGGCGGGC CCGGGGCTCC TCCATGCTGC 180 

CAGCCGCCGG GCTGCGGAGC CGACCAAGTG GCTCCTGCGA 3G 3A C 240

GCGGCGGGAG GTAAAGTGTT GAGAGAGGAG AACCAGTGCA TTGCTCCTGT GGTTTCCAGC 300

CGCGTGAGTC CAGGGACAAG ACCAACAGCT ATGGGGTCTT TCAGCTCACA CATGACAGAG 360

TTTCCACGAA AACGCAAAGG AAGTGATTCA GACCCATCCC AAGTGGAAGA TGGTGAACAC 420

CAAGTTAAAA TGAAGGCCTT CAGAGAAGCT CATAGCCAAA CTGAAAAGCG GAGGAGAGAT 480

AAAATGAATA ACCTGATTGA AGAACTGTCT GCAATGATCC CTCAGTGCAA CCCCATGGCG 540

CGTAAACTGG ACAAACTTAC AGTTTTAAGA ATGGCTGTTC AACACTTGAG ATCTTTAAAA 600 

GGCTTGACAA ATTCTTATGT GGGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATGAG 660 

CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 

GGAAAAATTC TCTTCGTTTC TAAGTCAGTC TCCAAAATAC TTAATTATGA TCAGGCTAGT 780 

TTGACTGGAC AAAGCTTATT TGACTTCTTA CATCCAAAAG ATGTTGCCAA AGTAAAGGAA 840 

CAACTTTCTT CTTTTGATAT TTCACCAAGA GAAAAGCTAA TAGATGCCAA AACTGGTTTG 900 

CAAGTTCACA GTAATCTCCA CGCTGGAAGG ACACGTGTGT ATTCTGGCTC AAGACGATCT 960 

TTTTTCTGTC GGATAAAGAG TTGTAAAATC TCTGTCAAAG AAGAGCATGG ATGCTTACCC 1020 

AACTCAAAGA AGAAAGAGCA CAGAAAATTC TATACTATCC ATTGCACTGG TTACTTGAGA 1080 

AGCTGGCCTC CAAATATTGT TGGAATGGAA GAAGAAAGGA ACAGTAAGAA AGACAACAGT 1140 

AATTTTACCT GCCTTGTGGC CATTGGAAGA TTACAGCCAT ATATTGTTCC ACAGAACAGT 1200

GGAGAGATTA ATGTGAAACC AACTGAATTT ATAACCCGGT TTGCAGTGAA TGGAAAATTT 1260

GTCTATGTAG ATCAAAGGGC AACAGCGATT TTAGGATATC TGCCTCAGGA ACTTTTGGGA 1320 

ACTTCTTGTT ATGAATATTT TCATCAAGAT GACCACAATA ATTTGACTGA CAAGCACAAA 1380 

GCAGTTCTAC AGAGTAAGGA GAAAATACTT ACAGATTCCT ACAAATTCAG AGCAAAAGAT 1440 

GGCTCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAGAA 1500 

CTGGAATATA TK3TATCTGT CAACACTTTA GTTTTGGGAC ATAGTGAGCC TGGAGAAGCA 1560 

TCATTTTTAC CTTGTAGCTC TCAATCATCA GAAGAATCCT CTAGACAGTC CTGTATGAGT 1620 

GTACCTGGAA TGTCTACTGG AACAGTACTT GGTGCTGGTA GTATTGGAAC AGATATTGCA 1680

AATGAAATTC TGGATTTACA GAGGTTACAG TCTTCTTCAT ACCTTGATGA TTCGAGTCCA 1740 

ACAGGTTTAA TGAAAGATAC TCATACTGTA AACTGCAGGA GTATGTCAAA TAAGGAGTTG 1800 

TTTCCACCAA GTCCTTCTGA AATGGGGGAG CTAGAGGCTA CCAGGCAAAA CCAGAGTACT 1860 

GTTGCTGTCC ACAGCCATGA GCCACTCCTC AGTGATGGTG CACAGTTGGA TTTCGATGCC 1920

CTATGTGACA ATGATGACAC AGCCATGGCT GCATTTATGA ATTACTTAGA AGCAGAGGGG 1980

GGCCTGGGAG ACCCTGGGGA CTTCAGTGAC ATCCAGTGGA CCCTCTAGCC TTTGATTTTT 2040

AACTCCAAAA ATGAGAAACA TTTTAAAGCA TTATTTACGA AAAAACTGTC TCAACTATTC 2100 

TTAAGTACTG TATTGATATT GTTTGTATCT TTTATTAATG TTCTACCACT TTTTATAGAT 2160 

TTGCATCTTC CTGTCACAGG GATGTGGGGA AATACGTTTT CCTCCCAAGA GAACCAAGTT 2220

TATTATAGAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAGTTGCC ATATTTTTGC 2280

TAAAATATTT CTAACCAAGA ATACTACTTA CATATTCTTT TGGCTTTGTT TTATTTTTGA 2340

TGCAGTTTTT TTTAGTTGAG GTAATGTAAT ATATTGATGT TTTCCTTTGT GTCTAAGATT 2400

GATTTATAAT AGTAGGTTTG TATAATTTGG AACATTTTCC ATGCCTTGCG AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAGTGAGT ACTTGAACAT TTAAAGGGAC 2520

AGTGCAATTT ATAGTCATAA TCACATTGAA TACTGTATTT GATCTTTGGA GACTTAGGCA 2580

AGCACAGAGC TGGGATATTT ATGCTCAGTT GAGCACTTTA AGATGAATTT TAAGTGAGAT 2640

GATTTCTTOC TTAAAACTCA GAAAGTCAAA AGAGTTTCAG CTTTCCTTAC AGAAAAGGAA 2700 

GGATCTTGGG CCCTAGATCT TGGGGATTAA CCTCTGCATA TAAGATTTAC TCTTAATAGG 2760 

CCAGACGTGG TGCTCACGCC TGTAATCCCA GTACTTTGGG AGGCTGAGAC GGGCAGATCA 2820 

CTTGAGGTCA GGAGTTCAAG ACCAGCCTGG CCAATATCGT GAAACCCCGT TTCTACTAAA 2880 

AATACAAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCACGATAA 2940

TGACAGTCCA TTCATGAGCG CAAAGGCCTC ATGACCTAAT GGCACACACC TGTAATCCCA 3000 

ACTGCTTGGG AGGCTGAGGC GAGAGGATTG CTTGAACCTG GGAGGCAGAG GTTGCAGTGA 3060 

GCCGAGATCG CACCACTGCA CTCCAGTCTG GGCAACAGAG TGAGACTTCA TCTCAAAAAA 3120 

AGTAAAAAAA AAGATTTAAT ATAATCACTG AAGATCTCTA TTATAGATAG ATTAGGTTTT 3180 

TGACATTGGA AACATACTTA GGGATAGATT TGTCCTAAAG GAAAAAAGTA GGCCCGGGCA 3240 

GATTAAATGT CTTGTGTAAA GTCACACATT AAATTCAGTC ACACATTAAA TTCATAGAGT 3300

TTTAAATGTT TAATGTATAT AAACCAGTTT CTTTATACAC ATTTGGGAAA ACATTGGTCT 3360 

CACAGATTAA ATGATTAACT AACTGACCCA GGAACTAGTT GTAGCTTTCT AAGTAATTAG 3420 

GCAATTACAG TTATTGCCTG TAACCAAAGG TAATAAAACA AAATGACAAG TACATGTTTA 3480

AAATTATGAG GCAATGAGAA ATAATTTAAA AACCAATTTT CTAGTTATAA TTTAAAATTT 3540

GGAGAGCATT TTTAACAGTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT AAGAATTAAG 3600

ATTATTTAAA ATACTGCATG TCTACCTTCT CGGGGATCAT ACTTTATAAC ACTTTCTGCT 3660

TCAGTAGCTC TTCATAGCTT GCCAAGTATG CTCCCATATT TTCTCTCTCG TGCCTCGCAA 3720

ATGAAAGTCA GATAGGCTGG GAACTCATGG GGCAGCCCTC AGACTTCAAT GTGGGCTTCA 3780

AATCCAGTTT CCTGTTCTAT ATGGTGCTAC ATCTTTCCAG AAAATTTCCC TCAGAGCCCC 3840

TCGCCAAAAC AAAGCATTAT TTTGACCCTG CATGCTATTT CTTTAGCTGT AGGTGATAGA 3900 

TTAGAACTTC TGTCAGACAT GTTAATGACA AACATACCAA CAGACAATAA CCAAAGCAAA 3960 

TGTTTCCTTC AAGTGTGAAA TGTGCAGGGG CTCGTGGGCA AGGATGTATT GGCACACTGT 4020 

CCTCTTGAAC TGATAGTGTC CCAGCAATGT TGGAGGTTGO CACCATTCCT GGTCCGACAC 4080 

TTGAGGACCT GAGAGACATC AGGTTTAGAA TGAGCCAAAG AAATCCTACA AGATGGGGAG 4140 

AATTGGTGTG CAGCAGCCTA AGTGTTATAG TTAAGTCTAA AGAAGTATGA AAGATCCCCT 4200 

GTGTTCTCTA AATTGAGCAG AGGGGCCTGC CTACCAATAT CACTTTTTAG GGGACTGAAC 4260 

CATTGCAGGT TAGACTTGGC TTCCAAAGAG TCTGCCTAAG CCAGGGGTGG CAGGGTAGGC 4320 

CATCATAGCT GGATGGCCTC AAAAGCAGAT GGGGGCAGAC TTGCCCTCGT GATGCCAGGA 4380 

TTTGAGAGGC AGAGTTTCTA GAGGGAGACC AGTGCTGCCT CTCACAGTGG CAGTTTTTTC 4440 

TCTTTGCAAG AGGAGGGGCT GTTCAATTCC ATAGACCAGT GGGCAGATAG CCAGTTGAAT 4500 

ACTCTGTGCA TGGTTTGATC CTTTATTAGT TCGCTCTAAT ATTTTTCTGT AGATCCTTTT 4560 

GTCCTGGACT CAAAATCTAA TCCATGCATT GTATGATACC GTAGCTCTCC TAAGGTTTGT 4620

GTTTCCTTCA AAATGTTTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCTG TTAGAAGTGA 4680 

CATATTTTTA TGGTATACAC TATGTTCCTT TTTTCTACTG CGAGTC^ATT TTTTGAATTT 4740 

TCGTGAGAAA GAATATATCT ACAAATTGCA CGAAAGTATC ATAAAAACAG TACTCTAGAG 4800
CAGCGCTGTC CAATAGAAAT A'
h AGTAAAAAGA T, 



A ACTAATTTTA 



TTGGTAATAC 
GTACTAGCCA 
AGCACAGTTC 
AGCTGGAAGT 
GTTTAAATGG 



AAACATAGAG 



TGTAGAGTAG 



CTCTACTGAA 
CTGCTCTGGA 
AGATCGCGCC 
AAAAAAAAAA 
GAAGTAGACC 
AATTATTTAT 



TTCCCACTCC 
GCCGGGCGCA 
CACGAGGTCA 
AATACAAAGC 



GTCATACAGC 



AATTTTATTT TCTTCTAGCC 
ATGTTTTAAT TCAGTATATC 
ATGTGATATT TTACATTCTT 
TTGATAGCAC ATCTCACTCT 
TAGTGGCTAC TGCACTGGAC 
GATTAGAATC CCAGAATCAG 
CTTTTAAAAA TGAGGACGCT GAGGCACAGA
TGTAGTGGTA



CTTGGAATCA 
CAAAGACAAA 
AAGTGTTTAT 
GATCTCCGTA 
TGTTTAGTAG 
CATCACTGAG 
GGCCGACATC 
CCACTTTTCT 
GCCTATGAGA 
TGTTTTATGA 
GGCTTAGAGA 
CAGCACGGAA 
TATTCTTGAG 



GAATGGCGTG 
ACTGCACCCC AGCCTGGGCG 
AAGAAAAGAA AAGAAAAGTC 
AAAGTTTATA CCATAAGGAT 
TGATCCTTGA ATCTGTAAGA 
TTTGAAAATA ATAAACTTTA 
AGTGAAATGA GTTATATGGT 
CAGGAAAGTA 
TTACAGTGTC 
AAATCTGGGG 
TGTCCAGTTT 
CCCTGGAATT 
TATGATTAGC 
GTTTTTGTAA 
GGCATTTATG 

GCTTTCCAGG TGTAGTGCCA 
CATGCTTTCT GAACTCACTT 
TCTTTTATAC 
TTTAAATTTT 
TAGTTTTTTT 



ACCATCCTGG 
AGGTGTGGTG 
AACCCAGGAG 
ACAGAGCGAG 
TAGAGAACAT 

TCAAATAACA 
AAATATCAGA 



CTCAACAGGG 
ATTCTTAGGG 
CAGCACTTTG 
CCAACATGGT 
GCGGGCGCCT 
GCAGAGATGG 



TTAAAAAAAG 
GAAACCCCGT 



ACGACAGAAA 
CACAGGTACA 
CAGATCATGC 
GGAGACTCAT 
TAGAAGCCAT 



ACTTTTAGTA 
AGGATGTCTT 
AGAAATAGCC 
TGCCAAGAGG 
ACTTGCCCAG 
AAAGAAAAGC 



TATATTA=lGT 
AGTCTCTATC 
TTTAGAAATC 
TGTTGTCATA 



AAAAAAAAAA 
GGTTATTATT 
TTTGAAGAAC 



CAGGATGATA 
ATTTAAAAGA 



AGCAGATCTT TTTITTCCAA 
TCAGCCTGTC 
TCCTGACACA 
ACCTGGAAAG 
GGTGTATGTC 
TCTTTTCCAG 
AATGAAAAGA 
ATAAACTTCC 



CTACAATAAG T 
TCTAGGAAGA T 
ATAAAAACTG A 



5700 
5820 



CCACTAGGTG 



TGGTGTGAGT 
TAAGAACTTT 
AAAACCTGCC 
ACTTCTCATA 
TTCACTTGTG 
GTGAAATTAT 
TCCAAAAAGT 



S540 
6600 
SS60 



MAAEEEAAAG 
QVEDGEHQVK 
QHLRSLKGLT 
LNYDQASLTG 
. YSGSRRSFFC 



GKVLREENQC 
MKAFREAHSQ 
NSYVGSNYRP 
QSLFDFLHPK 



LPQELLGTSC 
TNPWTKELEY 
SIGTDIANEI 
TRQNQSTVAV 



CIiVAIGRLQP 
YEYFHQDDHN 
IVSVHTLVLG 
LDLQRLQSSS 
HSHEPLLSDG 



I 

" IAPWSSRVS 
TEKRRRDKMN 
SFLQDNELRH 
DVAKVKEOLS 
EEHGCLPNSK 
YIVPQNSGEI 
NLTDKHKAVL 
HSEPGEASFL 
YLDDSSPTGL 
AQLDFDALCD 



SFDISPREXL 
KKEIIRKFYTI 
NVKPTEFITR 
QSKEXILTDS 



NDDTAMAAFM 



I 

; FSSHM7EFPR 
I PQCNPMARKL 
FWGCERGKI 
IDAKTGLQVH 
IICTGYLRSWP 
FAVNGKFVYV 
YKFRAKDGSF 
SRQSCMSVPG 
SMSNKELFPP 
NYLEAEGGLG 



XRKGSDSDPS 
DKITVLRMAV 
LFVSXSVSXI 
SNLHAGRTRV 



DQRATAILGY 
VTLKSQWFSF 
MSTGTVLGAG 
SPSEMGELEA 

DFGI SDIQW 



GGTTACTCAT 
GACGCCAAGG 
GATCTGGACT 



GAGCAGGACG 
GCAGGCTGGC 
TGCGTGCAGA 



CGGACAATTC 
CGGCCTGGAT 
CTGCAACGCC 
ATACCCGCCC 
GGGTACATCG 
CTTCGACGGC 
CTGTGTCCAG 

CCCTCGAATC C 



TCGCTGGCAG 
CTTCACGGGC 
AAGCTCAACC 



GGTAAGAGGG 
GAGCCATGGA 
TGCTGCTGCT 
AAGCAGATGA 
ACG1CTGCAC 



I 



I 



TTCTGGCGTT 
TCACCTCGCG 
AGTGCTACAG 
TGAGCTGCTA 
TGACGGCAGC 
GCACTCGGGA 
CCCGCTGTAA 
TCCGGCTGCC 



CATCCAGCTG 



CTGTGTGGGC 
CAACGCCAGC 
TAATGTGACT 
TGGAGTAACA 
CTCTGACCTC 



GAGGCGGCAC ACCCAGGGGG 
AAAGCAGGTG CCCAGGCCAT 
GGAGGAGCGC AGGCCCTGGA 
CCGAACAAGA TGAAGACAGT 
GGGGCGGTGG AGACCATCCA 
CTCCCCGGCA AGAATGACCG 
CAGCAATGCG CTCAGGATCG 
CCGGCAGGTA ATGAGAGTGC 



r ACAAGGGCTG 600 



3 TTGACTGGAG 



CTCCGAGACA 



AGTGAGACCC 
GGGAGTAGAA 
CCACCAGGAC 



CGCAACAAGA 
CCCACGACTG 
ACATCCACCA 
CACGAGGCCT 
CGCAGCAATT 



TCACGCTCAG 
CCTACTTCTC 
TGGCCTCAAC 
CCAAACCCAT 



ATTGGCAGCC C 



CCCACCACTG 



GGGTGTTCTA 
AGGATGCTAA 



GCTGGTTTGC 
GCTTTTTGAG 
ATGTTAGGAC 
GCTTCCTACT 
TGGCTCCCCA 
CATATGTCTT 
TGTGTGATCA 



GCCCAGCCCC 
GGCTTTGGGA 
GACAGCTCCT 
AGAGTGAGAG 
CACTTTCTCC 
CTCTAAGCAC 
CCTTACTAGA 
GTTTCTGGCA 



TGGTGTCCTA 
TGGGTACCCC 

AATAAAATAC 
GTATCCTTCT 
AAGTCAGCTG 
TAGCCAGCCI 



CTGTGAGCTT 
TCTTCTCATC 
ACATTCCCCA 



TCAOGGGGAA G 
GGACTTTGGA GCGTGGGGTG 
ACTCCCCGCA TCTTTGGGGA 
CTCGAGGGCA GGGACCGTGC 
TCAATAAAGA TTTAATTACT
I



I 



I 



MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 

CTEAVGAVET IHGQFSLAVX GCGSGLPGKU DRGLDLHGLL AFIQLQQCAQ DRCNAKLNLT 

SRALDPAGNE SAYPPNGVEC YSCVGLSREA CQGT3EP > DHVYf FDGI I 

V RGCVQDEFCT RDGVTGPGFT LSGSCCQG3R CNSDLRNKTY FSPRIPPLVR 

5VTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEPRLTGGA 

AGHQDRSNSG QYPAKGGPQQ PHNKGCVAPT AGLAALLLAV AAGVLL 



CACCAGTTTC 
CCCGGGCGTC 
CCTCCATGTT 



GCGAATCTCA 
ACCCTGGCGC 
GGGCTTAGCG 
CTACAGGGGC 



I 



I 



CCATCAAAGT 
CATGCCCACT 
TGATCCGCCT 
CTTTGCCCGC 
CAAGCCGCTG 
TTGTCCATCG 
AACTCATTGA 
GGACAAGGGT 
CCACTGTCTG 
AGAGGGACCA 



AATCGTGTGC 
CTGCTATGGA 
TTTGAGACAC 
TTTGACTATA 



ACGCTGCGCC GTCTGCGGGC GCTTCCGGGC 
CCCCCAGCCC TGGCTCCCCA GC7GCGCTGC 
GGTTCAGTGG GCTCAATCTG CGCAGCGCCA 
CTCCCGCGCC CCCCGGGACC CCCACGCCGC 
AGGCCGAGTA TCGACTCGGC CCCCTCCTGG 
GACACCGCCT CACAGATCGA CTCCAGGTGG 
TGGGCTGGTC CCCCTTGTCA 3ACTCAGTCA 



I CATGCTGGTC CTCGAGCGGC 540 



I CAGCCATCCA G 



GTCACTGGGC 
GGAGATTCTG 
AATCCGCCGG 



A TCCTGATAGA 
C TTCATGATGA 
A TCTCTCGACA 



C TCCCGTGGAG 660 



AAAGGAGGCC 
TGGCCCCCAA 
GTTGACTTGG 

CAAAGGAGCC 



CTGCCCCTTT 



ATG CAAACAC 
AGCCATCCCA 



TTATTTTGAT 
TCATATGCTT 
AGTAAAGGGA 
TCAGCCCAGG 



CACCACCAGA 



CTGAGCCGGG 
TCCAAGTGTG 
CCACTATTTA 



GATGTGTCAC 
TTACTTGGGC 
CCCTTTCCCC 
ATTTTTTATT 
TTTTTTTTTG 
GAACCTTAAT 
CAATAGGATG 
TGTTTTCCTG 
ATTGTCCAAT 
CCCTCCTTTT 
ATAAAAGTAA 



AACCTGTGGT 
TATTTTGGTG 
CCCACATTGG 



TTGCTACCCT 
TGGCCATGTC 
CATTAAAGTC 
C _,G.1T3CC 
CCCTGA7TTT 
AAGTTGTTCC 
CACCTCCTAC 
TCCTTCCAAT 



r GCACTCCCGG 84 0 

ATTCCCTTTG 
C TCCCCAGACT 
C TCACTGGAAG 
r CAACCCCTCC 



TTGGGGGAGG 
TCCATAATTT 



GGGCGCTCCC 
TACTAAAATG 
TTTTCCTGCC 



TAATGCCCTG 
CCCTACTTTG 
GGGAAGGAAT 

TCCAATTTTG 
TAAATAATCA 
TGGATTAT1T 
AAAAAAAAAA 



CCTACGCCGT Gl 
ACCCTACACT GACTTTGATG 780 
CCAGTACCAT O 
GTGTGGGGAC A' 
AGCCCATGTC I' 
TTCCCGACCC T 
TGTTACCCCT C 
AAGCCTGGCC Ti 
ACAGGGATAG A 
CAGTATTACT A 
CAGT7CCCTT C 
GGAGGGGGAA C 

TACCACCACA C 
ACCCCAGTAG C 
GGTCAAGCTG CTTACCTGCC 
TTGTTACCCC AAGGCTTCTT 
TTATCCCAAG TGCTCTTATT 
GGAAGATGGA CACCACCGGA 
GATGGGCTAG GGGAAATAAG 
CAGA7TTTTG CAACCTCCTC 
CGTATTGTGG GGAGGGGAGT 
AAAAAGCCAT GTGTGGAAAC 
AAAAAAAA 



MLTKPLQGPP APPGTPTPPP GGKDREAFEA EYRLGPLLGK GGFGTVFAGH RLTDRLQVAI 
KVIPRNRVLG WSPLSDSVTC PLEVALLWKV GAGGGHPGVI RLLDWFETQE GFMLVLERPL 
PAQDLFDYIT EKGPLGEGPS RCFFGQWAA IQHCHSRGW HRDIKDENIL IDLRRGCAKL 
IDFGSGALLH DEPYTDFDGT RVYSPPEWIS RHQYHALPAT VWSLGILLYD KVCGDIPFER 



CGGGCACAGG 
CCCTGGGGCC 
GTCAGCGACA 
CGTTGGGGAG 
AGCCTGCAAG 
CTGTAAGCCG 
GCACCACTGG 
CCAGCAGAAG 







TGAGCCCCGG CCGCCGGCCC G3CATGGGCG TCTCCCGC3G 

AGGGCCGGAT GGAGCCGCGG GACGGTAGCC CCGAGGCCCG 

CTTCCGCCTC GTCCAGCGGC TCCGAGCGCG ACGCCGGTCC 

GGCG ACT CAA CAAGCGGCGC T 

CCAAGTCGGG CCTCCAGCAC CTGGCCCCCC C 

AGTCAGAGCG GCAGATCCGG AGTACAGTGG ACTGGAGCGA 

ACATCTGGTT CGAGACCAAC G 
TAGCCAGGAT GCTGAAG 

ACACGCCCTG CATCGAGCAG CTG 1 

AATCAGGCTC CAGGAATGTC CGCGAGCCAA CCTTTGTACG 
GACGCCAGGA CGGCAAG 
GCTGGGGGTC CACGCAGCCG



CCAGAATACT 
GCTCATGAAG 



TGTCCTGCTT CATGCTGCAG CAGATCGAGG AGCCGTGCTC 
TGGTCATCCC GCCCACCTGG ATCCTCCGCG CCCGGAGGCC
GCAAGAAGAA GAAGAGGGCA TCCTTCAAGA GGAAGTCCAG 
GCCGCTGGAG ACCCTTCATC ATCAGGCCCA CCCCCTCCCC 
CCCCAAGAGT GGGGGCAACC AGGGTGCAAA 
CAGGTACTGT GCGGGCACCA
CCAGCGGCAT GACGACGGCT 
GCTGCAGGTG GGCGGACACG 
ATCCAAGGCC ATCCCGGTGC
CCGCATCGCC CTGCGCAACC 
CCCCCTGCAC AGCGAC CAG C 
CAGCATGCAC GACTATGAGG 
GCCGCTGGGC ACTGTGGTGG 



GTCCCCCAAG TGGTGCTTCC 
AGCCCAGGAG CACCTCAACT 
CCCTGAGCTG CTGGGGGCAT 
CACCTCACCC TGCTCACCCA 



AGGTGGATGG 
AGGCCACCAT 
AGCCGGTGCC 
CCCTGCACTA 
TCCCAGGAGA 
CCGATGGTGC 
TGGACGCCAC 



AGCTGGGGGC GACCTCATGC 
CAGCACTGGC AGCAAGGATG 
TGATGCGGTG GAGGAAAACG 
CACCATCTGC CACTACATCG 
CGACACTCCC CGGCAGCGGG 
GAACCGGCAG CACTACCAGA 

Seq ID NO: 172 Protein sequence:



MEPRDGSPEA 
GLQHLAPPPP 
MLKSVSRRKC 



CGAGCCCTGC 
GGTGCAGAAG 
AGAGCAGTTG 
CGACAAGGAG 
CAGTGACCTA 
TGGAGCCAAG 
CACTGCCAGC 
GATCGCACAG 
TGACCTCCCA 
ACTGCAAGGG 
CGACTTCTGT 
GCAGAGTCGC 
CCTGCTGGAC 
TTTGCACCAA 
GGCCTCGCTC 
TCAGGACACC 



GAGCACCACG ACTTTGAGCC 
ACCATGACGT CGTTGGCCGC 
CGCGAGGTGG TGCTCACCAC 
AAGCTTGCAG CCTCACGCAT 
GCCAAGCGGC GGAGCGCCGC 
CGCATCCAGG TGAGTCGCGT 



GAGCTCTGCC GTGCCCACAT 2160 - 



CGCTTCTACA 
GATGAGATTT 
ACCCCCACTT 
GATGCTGCAC 
AAGCTCCAGG 
ACGCTCCTGC 



GCCAGAAACT 



ATATCCTGGA 
CCCCTCTCCC 
CCCCTCAAGG 
AGCTGCACCG 
ACCACGCAGT 
CAGAGATCCT 



ATGAAGACAG ACCAGCAGGG 
GAGCTGGCCG CCTACCTGGA 
GAGACGGCIG TGTAGCGGGC 



PPTWILRARR 
NPKSGGNQGA 
LSTLDQLRLK 
LHAEPNPEAG 



AACKIWHTP 
FQQKFTFIISK 
PQNTLKASKK 
KIIQSFLWYL 
PPPPVAILPL 
PEDRDEGATD 



RQIRSTVDWS 
CIEQLEKINF 
EIVAISCSWC 



GIIPGEHHDFE 
GEPCKLAASR 
YDKEQLKEAS 
TTASRFYRID 
SLQGDAAPPQ 
YLLDHAPPEI 
AQDTELAAYL 



PQRHDDGYLE 
IRIALRNQAT 
VPLGTVWPG 
RAQEHLNYVT 
GEELIEAAKR 



ENROHYOMIQ 



NPRQVFDLSQ 
GTGNDLARTL 
RLPLDVFNNY 
AKHIRWCDG 
VIGFTMTSLA 
MVQKAKRRSA 
DSDLELCRAH 
EIAQDEIYIL 
NDFCKLQELH 
CLHQAAALGQ 
REDQETAV 



NWCCGYTDEP 



ALQVGGKGER 
APLHSDQQPV 
IERLQQEPDG 
DPELLGASAR 
RAGGDLMHRD 
RTICHYIVEA 



LEFHESREAN 
KPQCWFLKI 
LTQCREWLT 
PEQLRIQVSR 
/ 3 i SPTCQf 
PDLPTPTSPL 
EQSRTLLIIHA 



Seq ID NO: 173 DNA sequence
Nucleic Acid Accession #: 
Coding sequence: 1-1662 



L NKRRFPGLRL FGKRKAITKS 



SRNVREPTFV RHKWVHRRRQ 
FMLQQIEEPC SLGVHAAWI 
RPFIIRPTPS PLMKPLLVFV 

RKVHNLRILA CGGDGTVGW I 



PEKFNSRFRN 
PRYCAGTMPW 
TSKAIPVQVD 
VSMHDYEALH 
LSPKWCFLDA 



ATGCCGGTGC 



CACTACCTGT 
CTTTTTGCCT 
TCCCCGCGGC 
TTGCGCAAGT 
GTGGTGGATG 




" AGCTGACGAC 
GCATCCTGGC 
CCTTCGGCCT 
TCCTGGAGCA 
GGGGCTCGGT 
GCCTGCGCTC 
GCAACCGCCA 



AGCCCTGCGT 
AGCCTATGTG 
GTACGGCGCC 



GGCACTGTGC 



ATTGCCGCAT 
ATCTCCTTCC 
TACATGCTGG 



CCAGCCTGTT 
AGTTCATCCA 
TGCACCTGCT 
GCCAGGCCCT 

CTGACCTCAA 
ACATCTTCCA 



TGCCCTGGCA 60 

CACGGAAAAG 120 

CATTCAGAGC 180 

GAAGCTGCCC 240 

CCCTGACTAC 300 

GGTGGTCATG 360 

CGAGGTGCTG 420 



TGGCGCAGCA ACTTCCATGA GGCAGGCGAG 480 
T GGTGOGGGCC 540 



I GTACACGGCC 



GCACCATCGA 
ATGTCCAGAT 
GGATGGCCTT 
GTGGGCCCTT 
ATCAGAAGTT 



TACTTCCGGG 



CCACTAAGTA 
AGTGGCTCTA 
TGGTCACGGG 
GCCGCATCTG 
CCACCTACGC 



GGAGGACGCC 
CTTCTTTGTG 
GCAGGAGGGC 
GCAGAAGTGG 



GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGGGA 
CCTCAACAAG TACGACTCAT GGATTTCCTT CCTGAGCAGC 
CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 
GGGCATGTAC CGCAACAGCC TCCTCCAGCA GTTCCTGGAG 
CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 
TGGCTACCGA ACTAAGTATA CXMCGCX 



CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATGACC 
TTTCTTCCCC TTCTTCCTCA TTGCCACGGT TATACAGCTT
GAACATTCTC CTCTTCCTGC TGACGGTGCA GCTGGTGGGC 
CTGCTTCCTT CGGGGCAATG CAGAGATGAT CTTCATGTCC 
CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AGATCTTTGC CATTGCTACC 1380

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CGAAAAACCA TTGTGGTGAA CTTCATTGGC 1440 

CTCATTCCTG TGTCCATCTG GGTGGCAGTT CTCCTGGAGG GGCTGGCCTA CACAGCTTAT 1500 

TGCCAGGACC TGTTCAGTGA GACAGAGCTA GCCTTCCTTG TCTCTGGGGC TATACTGTAT 1560 

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCCA TCATCGCCCG GCGATGTGGG 1620

AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GACATGGCCC CCAAGCAGAG 1680 

CGGGTAAAGT GCAATGGGTA AGGGAGGGAA GGGGAATGGA AGAGAAAAGA CAGGGTGGGA 1740 

GGGAGGAGGG AGTGCTGTGT TTTAGTCTCT TAATGGTCCA AAGGACAAAT CTAAAATGCA 1800 

AAGAACGGTG ATGTAGTATG GCCTGACAGC TCTGTTTAGA GGAGGCAACA CTGATCCCCC 1860 

AGATGCAGGG CTGCAGGGGA TTCTGTGTTT TCAGACTGCC TGTCTGCnG CATCTGCACA 1920 

TAGGCAGTAG CCTCCTCCTG GGCTCCAGAG GGCACTCAGA AGTTGTGCTA AACCAAGTTA 1980 

AGTCCCATTC AGTGGCAACT TGTGATAGGT ACCTGAGTGA CGGCAACCTG CGGAAGGAGG 2040

TTCTCCCAGC CCATCTGAAC ACAACCAGAG GTGGCAGGAG AATTTCTACT GAGCGAGGTG 2100 

GGCCGGTTAG TGTATGTCAC CCCCACCCCA CCCATAAGTA GTCATCAATG CAATAAGATT 2160 

GCGCGTGAGA TACAAGGCCC AGAAGCCTGA TCTTTGGGCA TCAGAAAACA GGGTCCAGGA 2220 

ATGGTGCTTT ATGTGAGATA CCCCACTCCA CATCAACATT CCAGGGATGA GCCAAACCAG 2280 

CAGGGAGTTA GCACTGAACT GCTTTTAAAA GTGCACATTA AAAAGGAAAG TTTGCCAGGA 2340 

GGAACAAAGA GATTGTGGTG GTGCTAAAGG AGGCCATAAG CTACACAGAG GCCTTGGGTG 2400 

TTCCACCTGG AAACTGCTCA GACGTCTAGA TGGGTTCTTA GCTTGTCTG? GATCTCTGCT 2460 

GGGGAGATAA AAAGATTAAG CCCCAACATG TTCAGAAAAG AAGTGAAGTC TTGGGTATTT 2520

TAACCTGTAT ACTCTTGAAT TCCTCTCAAA TTCAGCTCTG ATCTGAGGCT AAGACACACT 2580

CCCCACTTCA CTTTCTTCAA AGCCACATTT TTTGAGGTAT CACTGCAGTC ACCTCTTCTA 2640

CCCTCATCAT CATAGGTAAG GTTTTCAAGG TGGCAATTGG GGCGGAGCCC CGGCTTCTTA 2700 

TAGAAGCTTC AGCAGGAGGC AAGCGTGTTC TCAGCACATA TGGGAACTA7 GAGGAGCCTC 2760 

TGATCAAATT GGCTACAATC TTGGAGCTGC TTGGACGGAT TCCTTGGCAG CCGGGTTAGC 2820

ATGTGTGACT TTCAGGCTAC TGTTCTTGAC AATCATCTCC AATGGAAAGC TTTTCAGTGT 2880

TCCCAAAGTG AACTCTCAAA TCCAAAATGG TTATCTTTGA GACCATCCAT TCTCCTCAGT 2940

GGCTTCTCCA GGGAATTCTT ACAGCCAAGT TGTGACAGTC ACTGCATTTG CCTGCTTCTT 3000 

TCCAGAAACC AAACTAGGAG ATGAAACTGG TTCCTACATC CTAAGGTTCT TGCTITCTCT 3060 

CTCATGCCTC CTGAGGCTGT TTTTGGCTGT TTTCCCTCTG CTGCTTTTGG GGAATGAGGG 3120

GAAGCCATTT TCCAAGTGAC TTGCAATCCA GGCTGTTCTC AGCGTTTTGA GTTTAAAACC 3180 
TGGGATCCTG ACTAAGCCTT TGACTTAAGG GTTGCTTG 






GGGCAAACCC TGGTGCTTTC CTTCATCTCC CACGAACTCA AGGGTTTTCC AAGTGTAGCT 3420 

AACAGTTGCC ACATCACACA GACCTCCAGT TTCTGGTAAG ACTGCTGGT? GACATCAGAC 3480 

T GAAGGCTGGA AGGCAGCAGG CATTTGCTAA GGCAGCTGAT CCAGGCAATC 3540 

3 CCAAGAAGTT AAACTATTTT GAGCATTAGA ATGGAGGAAA TCCGGTCAGC 3600 

CAAGTGCAGA GTTCAGACTT CGCTAAGGGC TTGTTTTTCT TCAGCATTTA CTTGAAGATT 3660 

AATGTAGGAT GACAGGCTCT CCTGGCTGTC CTACCATCAG CTCTGCCTTG CACTGTGGTC 3720 

GTCAACTTTC CTCAAATCAA AAACAGGCAG GTACAGGTAG TGGGCTCACA ACGTTTGACC 3780 

TCGACTGGTT TTTCTAAGTT ATTTTGTACA TTTTTCAGCA GCAAAACCAA ACTGGGTCTT 3840 

CAGCTTTATC CCCGTTTCTT GCAAGGGAAG AGCCTTTATA CAATTGGACG CATTTTGGTT 3900

:tcittt GTATTGTTTC TACAATAATT TGTAAACATA 3960 

?TTTTTT TAATTTTCAG GTCAAGTTTT TTATACTGCA 4020 
■AAAGA TTCTCACAT 

Seq ID NO: 174 Protein sequence:



I I I I I I 

MPVQLTTALR WGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA I 

LFAFLEHRRM RRAGQALKLP SPRRGSVALC IAAYQEDPDY LRKCLRSAQR I 

WDGNRQEDA YMLDIFHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG M

STFECIMQKW GGKREVMYTA FKALGDSVDY IQVCDSDTVL DPACTIEMLR VLEEDPQVGG 

VGGDVQILNK YDSWISFLSS VRYWMAFNVE RACQSYFGCV QCISGPLGMY RNSLLQQFLE 

DWYHQKFLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 

YFREWLYNSL WFHKHHLWMT YESWTGFFP FFLIATVIQL FYRGRIViNIL LFLLTVQLVG 

IIKATYACFL RGNAEMIFMS LYSLLYMSSL LPAKIFAIAT INKSGWGTSG RKTIVVNFIG

LIPVSIWVAV LLEGLAYTAY CQDLFSETEL AFLVSGAILY GCYWVALLML YLAIIARRCG 
KKPEQYSLAF AEV 

Seq ID NO: 175 DNA sequence
Nucleic Acid Accession #: NM_000691
Coding sequence: 43-1404



TTCCAGCAGC TGGAGGCGCT GCAGCGCCTG ATCCAGGAGC AGGAGCAGGA GCTGGTGGGC 
GCGCTGGCCG CAGACCTGCA CAAGAATGAA TGGAACGCCT ACTATGAGGA GGTGGTGTAC 
GTCCTAGAGG AGATCGAGTA CATGATCCAG AAGCTCCCTG AGTGGGCCGC GGATGAGCCC 
GTGGAGAAGA CGCCCCAGAC TCAGCAGGAC GAGCTCTACA TCCACTCGGA GCCACTGGGC 
GTGGTCCTCG TCATTGGCAC CTGGAACTAC CCCTTCAACC TCACCATCCA GCCCATGGTG 
GGCGCCATCG CTGCAGGGAA CGCAGTGGTC CTCAAGCCCT CGGAGCTGAG TGAGAACATG 
GCGAGCCTGC TGGCTACCAT CATCCCCCAG TACCTGGACA AGGATCTGTA CCCAGTAATC 
AATGGGGGTG TCCCTGAGAC CACGGAGCTG CTCAAGGAGA GGTTCGACCA TATCCTGTAC 
ACGGGCAGCA CGGGGGTGGG GAAGATCATC ATGACGGCTG CTGCCAAGCA CCTGACCCC? 
GTCACGCTGG AGCTGGGAGG GAAGAGTCCC TGCTACGTGG ACAAGAACTG TGACCTGGAC 
GTGGCCTGCC GACGCATCGC CTGGGGGAAA TTCATGAACA GTGGCCAGAC CTGCC r 
CCAGACTACA TCCTCTGTGA CCCCTCGATC CAGAACCAAA TTGTGGAGAA GCTCAAGAAG 
TCACTGAAAG AGTTCTACGG GGAAGATGCT AAGAAATCCC GGGACTATGG AAGAATCATT 
AGTGCCCGGC ACTTCCAGAG GGTGATGGGC CTGATTGAGG GCCAGAAGGT GGCTTATGGG
GGCACCGGGG ATGCCGCCAC TCGCTACATA GCCCCCACCA TCCTCACGGA CGTGGACCCC 
GC TGCCCATCGT GTGCGTGCGC 1080

G AGGCCATCCA GTTCATCAAC CAGCGTGAGA AGCCCCTGGC CCTCTACATG 1140 

TTCTCCAGCA ACGACAAGGT GATTAAGAAG ATGATTGCAG AGACATCCAG TGGTGGGGTG 1200

GCGGCCAACG ATGTCATCGT CCACATCACC TTGCACTCTC TGCCCTTCGG GGGCGTGGGG 1260 

AACAGCGGCA TGGGATCCTA CCATGGCAAG AAGAGCTTCG AGACTTTCTC TCACCGCCGC 1320

TCTTGCCTGG TGAGGCCTCT GATGAATGAT GAAGGCCTGA AGGTCAGATA CCCCCCGAGC 1380

CCGGCCAAGA TGACCCAGCA CTGAGGAGGG GTTGCTCCGC CTGGCCTGGC CATACTGTGT 1440

CCCATCGGAG TGCGGACCAC CCTCACTGGC TCTCCTGGCC CTGGAGAATC GCTCCTGCAG 1500 

CCCCAGCCCA GCCCCACTCC TCTGCTGACC TGCTGACCTG TGCACACCCC ACTCCCACAT 1560 

GGGCCCAGGC CTCACCATTC CAAGTCTCCA CCCCTTTCTA GACCAATAAA GAGACAAATA 1620

Seq ID NO: 176 Protein sequence:
MSKISEAVKR ARAAFSSGRT RPLQFRFQQL EALQRLIQEQ EQELVGALAA DLHKNEWNAY 60

YEEVVYVLEE IEYMIQKLPE WAADEPVEKT PQTQQDELYI HSEPLGVVLV IGTWNYPFNL 120

TIQPMVGAIA AGNAVVLKPS ELSENMASLL ATIIPQYLDK DLYPVINGGV PETTELLKER 180

FDHILYTGST GVGKIIMTAA AKHLTPVTLE LGGKSPCYVD KNCDLDVACR RIAWGKFMNS 240

GQTCVAPDYI LCDPSIQNQI VEKLKKSLKE FYGEDAKKSR DYGRIISARH FQRVMGLIEG 300 

QKVAYGGTGD AATRYIAPTI LTDVDPQSPV MQEEIFGPVL PIVCVRSLEE AIQFINQREK 360 

PLALYMFSSN DKVIKKMIAE TSSGGVAAND VIVHITLHSL PFGGVGNSGM GSYHGKKSFE 420 
TFSHRRSCLV RPLMNDEGLK VRYPPSPAKM TQH 

Seq ID NO: 177 DNA sequence 

Nucleic Acid Accession #: NM_001067.1

Coding sequence: 108-4703 

1 11 21 31 41 51 
CTAACCGACG CGCGTCTGTG GAGAAGCGGC TTGGTCGGGG GTGGTCTCGT GGGGTCCTGC 60

CTGTTTAGTC GCTTTCAGGG TTCTTGAGCC CCTTCACGAC CGTCACCATG GAAGTGTCAC 120

CATTGCAGCC TGTAAATGAA AATATGCAAG TCAACAAAAT AAAGAAAAAT GAAGATGCTA 180

AGAAAAGACT GTCTGTTGAA AGAATCTATC AAAAGAAAAC ACAATTGGAA CATATTTTGC 240

TCCGCCCAGA CACCTACATT GGTTCTGTGG AATTAGTGAC CCAGCAAATG TGGGTTTACG 300 

ATGAAGATGT TGGCATTAAC TATAGGGAAG TCACTTTTGT TCCTGGTTTG TACAAAATCT 360 

TTGATGAGAT TCTAGTTAAT GCTGCGGACA ACAAACAAAG GGACCCAAAA ATGTCTTGTA 420

TTAGAGTCAC AATTGATCCG GAAAACAATT TAATTAGTAT ATGGAATAAT GGAAAAGGTA 480

TTCCTGTTGT TGAACACAAA GTTGAAAAGA TGTATGTCCC AGCTCTCATA TTTGGACAGC 540

TCCTAACTTC TAGTAACTAT GATGATGATG AAAAGAAAGT GACAGGTGGT CGAAATGGCT 600

ATGGAGCCAA ATTGTGTAAC ATATTCAGTA CCAAATTTAC TGTGGAAACA GCCAGTAGAG 660

AATACAAGAA AATGTTCAAA CAGACATGGA TGGATAATAT GGGAAGAGCT GGTGAGATGG 720

AACTCAAGCC CTTCAATGGA GAAGATTATA CATGTATCAC CTTTCAGCCT GATTTGTCTA 780

AGTTTAAAAT GCAAAGCCTG GACAAAGATA TTGTTGCACT AATGGTCAGA AGAGCATATG 840

ATATTGCTGG ATCCACCAAA GATGTCAAAG TCTTTCTTAA TGGAAATAAA CTGCCAGTAR 900 

AAGGATTTCG TAGTTATGTG GACATGTATT TGAAGGACAA GTTGGATGAA ACTGGTAACT 960 

CCTTGAAAGT AATACATGAA CAAGTAAACC ACAGGTGGGA AGTGTGTTTA ACTATGAGTG 1020

AAAAAGGCTT TCAGCAAATT AGCTTTGTCA ACAGCATTGC TACATCCAAG GGTGGCAGAC 1080

ATGTTGATTA TGTAGCTGAT CAGATTGTGA CTAAACTTGT TGATGTTGTG AAGAAGAAGA 1140

ACAAGGGTGG TGTTGCAGTA AAAGCACATC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200

ATGCCTTAAT TGAAAACCCA ACCTTTGACT CTCAGACAAA AGAAAACATG ACTTTACAAC 1260

CCAAGAGCTT TGGATCAACA TGCCAATTGA GTGAAAAATT TATCAAAGCT GCCATTGGCT 1320

GTGGTATTGT AGAAAGCATA CTAAACTGGG TGAAGTTTAA GGCCCAAGTC CAGTTAAACA 1380

AGAAGTGTTC AGCTGTAAAA CATAATAGAA TCAAGGGAAT TCCCAAACTC GATGATGCCA 1440

ATGATGCAGG GGGCCGAAAC TCCACTGAGT GTACGCTTAT CCTGACTGAG GGAGATTCAG 1500

CCAAAACTTT GGCTGTTTCA GGCCTTGGTG TGGTTGGGAG AGACAAATAT GGGGTTTTCC 1560

CTCTTAGAGG AAAAATACTC AATGTTCGAG AAGCTTCTCA TAAGCAGATC ATGGAAAATG 1620 

CTGAGATTAA CAATATCATC AAGATTGTGG GTCTTCAGTA CAAGAAAAAC TATGAAGATG 1680 

AAGATTCATT GAAGACGCTT CGTTATGGGA AGATAATGAT TATGACAGAT CAGGACCAAG 1740 

ATGGTTCCCA CATCAAAGGC TTGCTGATTA ATTTTATCCA TCACAACTGG CCCTCTCTTC 1800 

TGCGACATCG TTTTCTGGAG GAATTTATCA CTCCCATTGT AAAGGTATCT AAAAACAAGC 1860 

AAGAAATGGC ATTTTACAGC CTTCCTGAAT TTGAAGAGTG GAAGAGTTCT ACTCCAAATC 1920

ATAAAAAATG GAARGTCAAA TATTACAAAG GTTTGGGCAC CAGCACATCA AAGGAAGCTA 1980

AAGAATACTT TGCAGATATG AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCTGAAG 2040

ATGATGCTGC TATCAGCCTG GCCTTTAGCA AAAAACAGAT AGATGATCGA AAGGAATGGT 2100

TAACTAATTT CATGGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT GAGGATTACT 2160

TGTATGGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAG GAACTTATCT 2220 

TGTTCTCAAA TTCTGATAAC GAGAGATCTA TCCCTTCTAT GGTGGATGGT TTGAAACCAG 2280 

GTCAGAGAAA GGTTTTGTTT ACTTGCTTCA AACGGAATGA CAAGCGAGAA GTAAAGGTTG 2340

CCCAATTAGC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCATGGTGAG ATGTCACTAA 2400

TGATGACCAT TATCAATTTG GCTCAGAATT TTGTGGGTAG CAATAATCTA AACCTCTTGC 2460

AGCCCATTGG TCAGTTTGGT ACCAGGCTAC ATGGTGGCAA GGATTCTGCT AGTCCACGAT 2520 

ACATCTTTAC AATGCTCAGC TCTTTGGCTC GATTGTTATT TCCACCAAAA GATGATCACA 2580 

CGTTGAAGTT TTTATATGAT GACAACCAGC GTGTTGAGCC TGAATGGTAC ATTCCTATTA 2640

TTCCCATGGT GCTGATAAAT GGTGCTGAAG GAATCGGTAC TGGGTGGTCC TGCAAAATCC 2700

CCAACTTTGA TGTGCGTGAA ATTGTAAATA ACATCAGGCG TTTGATGGAT GGAGAAGAAC 2760

CTTTGCCAAT GCTTCCAAGT TACAAGAACT TCAAGGGTAC TATTGAAGAA CTGGCTCCAA 2820

ATCAATATGT GATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAACCATT GAAATCTCAG 2880

AGCTTCCCGT CAGAACATGG ACCCAGACAT ACAAAGAACA AGTTCTAGAA CCCATGTTGA 2940

ATGGCACCGA GAAGACACCT CCTCTCATAA CAGACTATAG GGAATACCAT ACAGATACCA 3000

CTGTGAAATT TGTTGTGAAG ATGACTGAAG AAAAACTGGC AGAGGCAGAG AGAGTTGGAC 3060

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTGACC 3120 

ACGTAGGCTG TTTAAAGAAA TATGACACGG TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCAGACTTAA ATATTATGGA TTAAGAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240

CTGCTAAACT GAATAATCAG GCTCGCTTTA TCTTAGAGAA AATAGATGGC AAAATAATCA 3300 
TTGAAAATAA GCCTAAGAAA GAATTAATTA AAGTTCTGAT TCAGAGGGGA TATGATTCGG
ATCCTGTGAA GGCCTGGAAA GAAGCCCAGC AAAAGGTTCC AGATGAAGAA GAAAATGAAG 
AGAGTGACAA CGAAAAGGAA ACTGAAAAGA GTGACTCCGT AACAGATTCT GGACCAACCT 
TCAACTATCT TCTTGATATG CCCCTTTGGT ATTTAACCAA GGAAAAGAAA GATGAACTCT 
GCAGGCTAAG AAATGAAAAA GAACAAGAGC TGGACACATT AAAAAGAAAG AGTCCATCAG 
ATTTGTGGAA AGAAGACTTG GCTACATTTA TTGAAGAATT GGAGGCIGTT G 
AAAAACAAGA TGAACAAGTC GGACTTCCTG GGAAAGG 

CACAAATGGC TGAAGTTTTG CCTTCTCCGC GTGGTCAAAG AGTCATTCCA O 
TAGAAATGAA AGCAGAGGCA GAAAAGAAAA ATAAAAAGAA AATTAAGAAT GAAAATACTG 
AAGGAAGCCC TCAAGAAGAT GGTGTGGAAC TAGAAGGCCT A 

AACAGAAAAG AGAAC CAGGT ACAAAGACAA AGAAACAAAC TACATTGGCA TTTAS 
3 AAAGAAGAGA AATCCCTGGC CTGATTCAGA A 
GATGCTAGTC



AATTCACAAT GGATTTGGAT 
ATGAAGATTT 
GTAACAAAGA 
AGGGCAGTGT ACCACTGTCT TCAAGCCCTC 
TTACAAACCC 



CAGCTTTGAA 
AAAGGAAGCC 
CAGTCACAAG 
CTGTGGCTCC 
CAGATGAAGA 



CTGCTACACA 
GGGCTGCCCC 
ACTCTAATTT 



TGGGGAAGGT 
TTTTTATAAT 
TCTGTGTTTT 
TGTAGAAATA 
AGCTAAAACT 
AGATATGAGA 
GATAGAACTT 



AAAGAAAAAC 



AGAGCAGTTT 
TAGAG CAT AA 
TATGGTTCTA 
AGCAATGAGA 
ACTTTGGCTG 



CAAGACATCA 
TAGTGACCAT 
TCTTTTGTCT 
CACTTCAGCG 
GATTTAAAAG 
TTATCTGTTT 



CTCATGGGCA 
TTAAAACCTG 
TAAAGCAGTG 
TGTCACTCTT 
TATCTTAGTT 



ATCTCCCAAA 
CATTTGATCC 
TTCATTTTGG 
GTTATATGTG 
TCTATTAGCT 



AATTGCTCAT 
TGTCTATAAC 
CAGCTCTTGA 
AATTTCTAAG 
ATGTTATATT 



CAAAACCAAG A 
TGAGAAAATT GTTTCGAAAG 
CCATATGGAC TTTGACTCAG 
TATAAAGTAC CTGGAAGAGT 
ATCTTACCAA 

AAGCCCAAGT 1 
TTGTTTTCTT CTCTGCTTTG 
ATTTTTAAGT ICTTCTGAAC 
TGTTTATTAA CCATCCACTA 
CCTCCTTTTC TACTTTCAGT 
"TATACATAA TTTACCATCA 
TCAGCCTCTT ATGTGCCAAG 
TTCTCAAATC ATCAGAGGCC 



AGGACTGGAT 



TTGTAAACTT 
CAACGTTTTT 
TTTAATAAAA 



CTGGCTGCCT CTGAGTCTGA 
TGCAGAAGAC TCGGGGACAA 
CTCAGCAATG AGCTATTAGA 
TGTTAAGACC TGTCTACATT 
GTAAATATTT ACTATGTTTT 
TGTTCTAAAC ATTGC 




IGSVELVTOO 



NGKGIPWEII 



LTMSEKGFQQ 
VQLNKKCSAV 



ENMQVNKI KK NEDAKKRLSV ERIYQKKTQL E 

LYKIFDEILV NAADNKQRDP KMSCIR 
IFGQLLTSSM YDDDEKK 

AGEMELKPFN GEDYTCITFQ PDLSKFKMQS L 

KLPVKGFRSY VDMYLKDKLD ETGNSLKVIH EQVWHRWEVC 

ISFVNSIATS KGGRHVDYVA DQIVTKLVDV VKKKNKGGVA VKAHQVKNHM 

PTFDSQTKEN MTLQPKSFGS TCQLSEKFIK AAIGCGIVES ILNWVKFKAQ 



DQDQDGSHIK G 



RKEWLTNFME 
GLKPGQRKVL 
LNLLQPIGQF 
YIPIIPMVLI 
ELAPNQYVIS 



IKIVGLQYKK 
WPSLLRHRFL EEFITPIVKV 
SKEAKEYFAD MKRHRIQFKY 
PEDYLYGQTT TYLTYNDFIN 
EVKVAQLAGS 



GEVAILNSTT 



LPSPRGQRVI 
GTKTKKQTTL 
ATKTKFTMDL DSDEDFSDFD 
ADDVKGSVPL SSSPPATHFP 
TKRDPALNSG VSQKPDPAKT 
DFDSAVAPRA 



SCKIPNFDVR EIVNNIRRLM 
IEISELPVRT WTQTYKEQVL 
ERVGLHKVFK LQTSLTCHSM 
LGAESAKLNN QARFILEKID 
EENEESDNEK ETEKSDSVTD 
KSPSDLWKED LATFIEELEA 



KELILFSNSD NERSIPSMVD 
EHSLMMTIIN LAQNFVGSNN 
KDDHTLKFLY DDNQRVEPEW 
DGEEPLPMLP S 



VLFDHVGCLK 
GKIIIENKPK 
SGPTFNYLLD 
VEAKEKQDEQ 
NENTEGSPQE 



KYDTVLDILR 
KELIKVLIQR 
MPLWYLTKEK 



SPKLSNKELK 
AKSQSSTSTT 
IVSKAVTSKK 



DGVELEGLKQ 
PPRETEPRRA 
PQKSWSDLE 
GAKKRAAPKG 



Coding sequence: 148-7095



CGGCGAGGGG CCGCAGACCG T CTGG AAATG C 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGA1GA AGATCTTACA 
CAAGTAAATG TGAAT CTTAA GAAACTTAAA TTTCAGGGTT G " " " 



ATCATTGGAA 42 0 
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 

TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 
ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCAC 
GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACC 
TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC A 
TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCT — 
ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC T 

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840

AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGATGAC 4440 

AGAGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500

CAGGAAAAGG TAATGAATGA TTCAGACACC CACGAAAACA GTCTTATGGA TCAGAATAAT 4560

CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680

TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740

CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800

GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860

GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920

GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCACGTT 4980 

TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 5280

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 5640 
AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 5700

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 5760

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 5820

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 5940

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 6000 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240

CAGTCAAATA TACAGCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300

AATCGAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360

GGAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAGG ATTTCTGGAG GATGATATGG 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6540 

TTTGTTTACT GGCCAAATAA AGATGAGCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAAGAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGAGC ATGGAGGAGT GACGGCAGGA 6840 

ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGG GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200

GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTTG TAGCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAATT 7680 

GTTTAGTTTA ATGACGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAATACCTT CATTTTGAAA GAAGTTTTTA TGAGAATAAC ACCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATIA AAAAAAAAAA 7920 



Seq ID NO: 180 Protein sequence: 
Protein Accession # : Eos sequence 

1 11 21 31 41 51

| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNGETPLQ PSYSSEVFPL VTPLLLDNQI 780

LNTTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPLL PFSSASFSSE LFRHLHTVSQ 840 

ILPQVTSATE SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFGSESGVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPEPSYAL SDNEGSQKIF TVSYSSAIPV HDSVGVTYQG 960 

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTALNISS 1020 

PVSVAEFTYT TSVFGDDNKA LSKSEIIYGN ETELQIPSFN EMVYPSESTV MPKMYDNVNK 1080 

LNASLQETSV SISSTKGMFP GSIiAHTTTKV FDKEISQVPE NNFSVQPTHT VSQASGDTSL 1140 

KPVLSANSEP ASSDPASSEM LSPSTQLLFY EISASFSTEV LLQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSEHMLHSTS VPVFDVSPTS H.MHSASLQGL 1260 

TISYASEKYE PVLLKSESSH QWPSLYSND ELFQTANLEI NQAHPPKGRH VFATPVLSID 1320 

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDKSVPIG NGKVAITAVS 1380 

PHRDGSVTST KLLFPSKATS ELSHSAKSDA GLVGGGEDGD TDDDGDDDDD DRGSDGLSIH 1440 

KCMSCSSYRE SQEKVMNDSD THENSLMDQN NPISYSLSEN SE3DKRVTSV SSDSQTGMDR 1500 

SPGKSPSANG LSQKHNDGKE ENDIQTGSAL LPLSPESKAW AVLTSDEESG SGQGTSDSLN 1560 

ENETSTDFSF ADTNEKDADG I LAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HESRIGLAEG LESEKKAVIP LVIVSALTFI CLWLVGILI YWRKCFQTAH FYLEDSTSPR 1680 

VISTPPTPIF PISDDVGAIP IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI 1740 

TADSSNHPDN KHKNRYINIV AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA 1800 

QGPLKSTAED FWRMIWEHNV EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ 1860 

VLAYYTVRNF TLRNTKIKKG SQKGRPSGRV VTQYHYTQWP DHGVPEYSLP VLTFVRKAAY 1920 

AKRHAVGPW VHCSAGVGRT GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE 1980 

QYVFIHDTLV EAILSKETEV LDSHIHAYVH ALLIPGPAGK TKLEKQFQLL SQSNIQQSDY 2040 

SAALKQCNRE KNRTSSIIPV ERSRVGISSL SGEGTDYINA SYIMGYYQSN EFIITOHPLL 2100 

HTIKDFWRMI WDHNAQLWM I PDGQNMAED EFVYWPNKDE PINCESFKVT LMAEEHKCLS 2160 

NEEKLIIQDF ILEATQDDYV LEVRHFQCPK WPNPDSPISK TFELISVIKE EAANRDGPHI 2220 

VHDEHGGVTA GTFCALTTLM HQLEKENSVD VYQVAKMINL HRPGVFADIE QYQFLYKVIL 2280 
SLVSTRQEEN PSTSLDSNGA ALPDGNIAES LESLV 



CACGCACGAT 
ATTTCCTTCG 
CCGCAGACCG 






CAGCTCCTCT 
CTTGTTGAAG AGATTGGCTG 
AAATATCCAA CATGTAATAG 
CAAGTAAATG TGAATCTTAA
AACACATTCA TTCATAACAC 
GTCAGCGGAG GAGTTTCAGA 
AAATGCAATA TGTCATCTGA 



CTCCCCCTCC 
TCTGGAAATG 
CCTGGATTGG 
GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 



31 41 
I I 
■ TCTATACACT GGAC-GATTAA 
CTCTCCACTC TGAGAAGCAG 
CGAATCCTAA AGCGTTTCCT 
GCTAATGGAT ACTACAGACA 
GGAGCACTGA ATCAAAAAAA 
TCTCCTATCA ATATTGATGA 
TTTCAGGGTT GGGATAAAAC



AACAAACAAA 



GGAAAAGGGA 



CGATTATTGA 



ACAGTTAGCA 1 



AATGGTGTTT AAAGCAAGCA AGATAACTTT 
TGGATCAGAG CATAGTTTAG AAGGACAAAA 
TGATGCGGAC CGATTTTCAA GTTTTGAGGA 
TTTATCCATT TTGTTTGAGG TTGGGACAGA
TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA 
GAACCTTCTG CCAAACTCAA CTGACAAGTA 
TCCCTGCACA GACACAGTTG ACTGGATTGT 
CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC 



CGCTTGCATT 180 

ACAGAGAAAA 240 

TTGGGGAAAG 300 

AGATCTTACA 360 

ATCATTGGAA 420 

TGACTACCGT 480 

TCACTGGGGA 540 

ATTTCCACTT 600

AGCAGTCAAA 660

AGAAAATTTG 720 

GCAGGCTGCT 780 

TTTTAAAGAT 900 



TTCTCTAGAC 
AGTT CAGAAC 
TGGGAAAGAC 
CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG 
AATCCTGAAC 
GAAGAGGGAA 
AACCAAATCA 



GAGAGGACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 
ATGAAGCCAA 
AAGGGTGATG TTCCCAATAC 



CTCATACACT GGAAAGGAAG AGATTCATGA A 
TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 
ATGATTGAGA AGTTTGC 
GAATTTTTGA CAGATGGCTA T' 
GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 
AAAATACAGC GACCAACTGA TTGTCGACAT G 



GAAGGTACTT 
AACTTGTCGG 
AGTTTATTGA 
GCAACTTCTG 
GAAAACCCAG 
GAAGATTCAA 



ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 



AGAAGGCGCT 
ACCCCAGATT 
GACTAACCGA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 



ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 
CACTGAGATA 
GATGTCACAG 
CTTCCCAACT 
CTCCACGGTC 



TCCACTTCCC 
ACTGTGACTG AACTGCCACC 
TCTAAAACTG TTCTTAGATC 
ACAGTTTCTA TAACAGAATA 
GGAGCTGAAG ATTCTTCAGG 
AACATATCCC AAGGGTATAT 
CTTATACCAG AATCTGCTAG 
TCACTAAAGG ATCCTTCTAT 



TAAATTAGCC 
TCACACTGTG 
TCCACATATG 



CGTGTTGATG 
GGTCCCTCAG 
GAGGTAACAC 
AACGTGGTAT 



TCCAGACAAC 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC 
TTGGAATCCG 
TGTCTAGTGG 
TTTTACTTAG 
CCAATTTCAG 



TTACAGATCT 
CTCATGCTTT 
ACTCGCAGAC 



1560 
1680 



CTCCAGTCCC 1920 



2100 
2160 
2220 



AAATGCTTCC 



AGGCAGAGAG 
GACAACCAAG 
GGAAATGCCA 
TACCCCATCC 
AACCCAACCG 



AGTTATACCC 
TATTCTCATC 
ATCCCCTAGA 



GTGCAGAGCT 



TACTGAAGAA 
AAATATCGTT 



ATGATGTCGG 
GTAGTGGGTT 
GTACTGTTGA 
ATCGATACAT 
CAGCTTGCTG AAAAGGATGG 
TACAACAGAC CAAAAGCTTA 
TTCTGGAGAA 
GAGAAAGGAA 
AACTTTCTGG 
ACTCTAAGAA ACACAAAAAT AAAAAAGGGC 
GCAGTGGCCT 
GGCAGCCTAT 
TGGAAGAACA 



CTTGTGATCG 
TACTGGAGGA 
GTTATATCCA 
AIAAAGCACT 
TTTGAGACAC 
ACAG CAGACA 
GCCTATGATC 
GATTATATCA 
TATTGCTGCC CAAGGCCCAC 



TGTCAGCCCT 
AATGCTTCCA 
CACCTCCAAC 
TTCCAAAGCA 



T AGCTGAGCGG 246 0 



GCTCCAACCA 



ft GGAGAAAATG T 



GTGCTGACCT 
GTCCACTGCA 
CAGCAGATTC 
CAAAGAAATT 
GAGGCCATAC 



TTGTGAGAAA 
GTGCTGGAGT 
AACACGAAGG 
ATTTGGTACA 
TTAGTAAAGA 



GAAGTTATTG 
TGGCCTGCCG 
G1GCTTGCCT 
TCCCAGAAAG 
GACATGGGAG 
GCCAAGCGCC 



ATGCCAATTA 
TGAAATCCAC 
TCATGATAAC 



GACTTTTATC 
GACTGCACAC 
ACCTATCTTT 
TGTTGCAGAT 
TTACCAGGAA 
CCCAGACAAC 
TAAGCTAGCA 



2 32 0 
2380 
2 94 0 



AGCCAGTCAA 
AAGAATCGAA CTTCTTCTAT 
AGTGGAGAAG GCACAGACTA 
GAATTCATCA TTACCCAGCA 
TGGGACCATA 
GAATTTGTTT ACTGGCCAAA 



AACTGAGGAG 
AACTGAGGTG 
AGCAGGCAAA 



CAATATGTCT 



AAACCTCGTG 
GGAGTACGGG 
GAGGAATTTT 
TGGACGTGTG 
CTCCCTGCCA 
GCCTGTTGTC 
CAGTATGTTG 
CATCCGTTCA 



GAAGCTGCCA 
GGAACTTTCT 
GTTTACCAGG 
CAGTATCAGT 
CCATCCACCT 
TTAGAGTCTT 
CTCTTCCTAA 
CCTGACAGTA 
TGCCTTTTTG 
CAGTATTTCT 
CAATTTATAG 
TTAGCTGTAT 
TGTAAATAAA 



CTACACAGGA 
CAGATAGCCC 
ATAGGGATGG 
GTGCTCTGAC 
TAGCCAAGAT 
TTCTCTACAA 
CTCTGGACAG 
TAGTTTAACA 



CATCCCTGTG 

CCCTCTCCTT 

TAAAGATGAG 
ATGTCTATCT 
TGATTATGTA 
CATTAGTAAA 



TCTGCAGCCC 



CCAGCTCCTG 



TCCTATATCA 
CATACCATCA 
A1TCCTGATG 
CCTATAAATT 
AATGAGGAAA 
CTTGAAGTGA 
ACTTTTGAAC 
GTTCATGATG 
CACCAACTAG 



T TTCATCCCTG 



GAGGATGATA 



ACTTTCATGA 
CAAGACTTGT 
AAGAATGGAA 
AGGTTAGGAA 
TTGTAGCAAT 



TAATGGTGCA 
CAGAAAGGGG 
GAAAATCAGT 
CATAGGATTC 
AATTTACTTA 
TTGTGGTATT 
TTCCAAACTA 
TATCAGGTTT 
ATATGATATT 



CTAGTTCTGT 



TTATGTTTGA 
TTTTTCTGTA 
CAGAAAATGT 
GCTAGAAATA 
CAACATTTTA 



GGCACTTTCA 
TTATAAGTGT 
AGCATGGAGG 
AAAAAGAAAA 
GAGTCTTTGC 
GCACAAGGCA 
ATGGAAATAT 
CACATCTGAG 
TATCTGTTGA 
'TATATCAT 
ACTAAAATGA 
TTGATTTTAA 



TAAGGTCACT 
TCAGGACTTT 
GTGTCCTAAA 
TATAAAAGAA 



TGACATTGAG 



AGCTGAGAGC 
CATTGTT7TC 
TTTCCCATCA 
TAACAATGTG 
TTGAATTTTA 
CAGAAAATTI 
AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



ILFEVGTEEN 



CSSEPENVQA 



GSKTVLRSPH 
ENISQGYIFS 
TAQPDVGSGR 



EEEGKDIEEG 
GKGDVPNTSL 
MNLSGTAESL 



I 

• WANGYYRQQR 
KFQGWDKTSL 
EHSLEGQKFP 
ESVSRFGKQA 
AVFCEVLTMQ 
DPENYTSLLV 
NMSYVLQIVA 



KLVEEIGWSY 
LEMQIYCFDA 



I I 
TGALNQKNVJG KKYPTCNSPK 
TVEINLTMDY RVSGGVSEMV 
DRFSSFEEAV KGKGKLRALS 
LPNSTDKYYI YNGSLTSPPC 
LQNNFREQQY KFSRQVFSSY 



PLVIVSALTF 
PIKHFPKHVA 
VAYDHSRVKL 



ESFLQTNYTE 
SSRQQDLVST 
ICLWLVGIL 
DLHASSGFTE 
AQLAEKDGKL 



GSQKGRPSGR WTQYHYTQW 
TGTYIVLDSM LQQIQHEGTV 
VLDSHIHAYV NALLIPGPAG 
VERSRVGISS LSGEGTDYIN 
MIPDGQNMAE DEFVYWPNKD 
VLEVRHFQCP KWPNPDSPIS 
MHQLEKENSV DVYQVAKMIN 



NSTSQPVTKL 
NTVSITEYEE 
VLIPESARNA 
IRVDESEKTT 
VNWYSQTTQ 
IYWRKCFQTA 
EFETLKEFYQ 
TDYINANYVD 
YWPADGSEEY 
PDMGVPEYSL 
NIFGFLKHIR 
KTKLEKQFQL 
ASYIMGYYQS 
EPINCESFKV 
KTFELISVIK 



TNQIRKKEPQ 
ATEKDISLTS 
ESLLTSFKLD 



SDQLIVDMPT 
ISTTTHYNRI 
QTVTELPPHT 



KSFSAGPVMS 
PVYNAEASNS 
HFYLEDSTSP 
EVQSCTVDLG 
GYNRPKAYIA 
GNFLVTQKSV 



QGPSVTDLEM 
SHESRIGLAE 
RVISTPPTPI 



NVKFPSSTDI 



SQRNYLVQTE 
LSQSNIQQSD 
NEFIITQHPL 
TLMAEEHKCL 
EEAANRDGPM 
EQYQFLYKVI 



FTLRMTKIKK 
WHCSAGVGR 
VEAILSKETE 
EKNRTSSIIP 



SNEEKLIIQD 
IVHDEHGGVT 
LSLVSTRQEE 



CAAAAAAAAC A 



CTCACTTCGA 
CTCCCCCTCC 
TCTGGAAATG 
CCTGGATTGG 
GTCCTATACA 
CATGTAATAG CCCAAAACAA 
TGAATCTTAA GAAACTTAAA 
TTCATAACAC TGGGAAAACA 
GTCAGCGGAG GAGTTTCAGA AATGGTGTTT 
AAATGCAATA TGTCAT CTGA 



CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATG 
AACACATTCA 



GGAAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 
TGGGAAAGAC 



AGTTAAGAGC 
CGATTATTGA 
TCATACTGTT 
TGACATCTCC 
TCTCTGAAAG 



AGGTGTTTTC 
CAGAAAATGT 
CTCGAGTCGT 



GGTGCTATTC 
AATCCTGAAC 



TTTATCCATT 
TGGAGTCGAA 
GAACCTTCTG 
TCCCTGCACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 
AACCAAGCAT 



I 

TCTATACACT 
CTCTCCACTC 
CGAATCCTAA 
GCTAATGGAT 
GGAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTGGAAATTA 
AAAGCAAGCA 
CATAGTTTAG 
CGATTTTCAA 



GGAGGATTAA AACAAACAAA 



ATATTGATCA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 



CGCTTGCATT 
ACAGAGAAA.A 
TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 



AGTGTTAGTC 



TTGGGACAGA 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGATTGT 



AGCAGTCAAA 



GCAGGCTGCT 



TTCGAGAGCA 
AGATTCATGA 
ATACCAGCCT 
AGTTTGCAGT 
GAATTTTTGA CAGATGGCTA 
TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT 



TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 



AAGGGTGATG TTCCCAATAC 



AACTTGTCGG 



GCAACTTCTG 
GAAAACCCAG 
GAAGATTCAA 



GGACTG CAGA 
CCAGTTTCAA 
CTATCCCATT 



CTTCATCAGG 



AAAATACAGC 
CCCTGAATTA 
AGAAGGCGCT 
ACCCCAGATT 
GACTAACCGA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATATGATGTC 



AATGCAACAA 
ACAGTACAAG 
AGCAGTTTGT 

TTTGTACCAG 
TCAAGACTTG 
AGTAGCCATA 
GCCTACTGAT 



TCTACCACAA 
TCCCCAACAA 
TCCACTTCCC 
ACTGTGACTG 



A CAGTGCTACA 



GAGGAAGTGA A' 
AACCAGTCAC TAAATTAGCC 
AACTGCCACX: Ti 



CAGGCCCAGT 



AGACATAACA 
CACTGAGATA 
G ATGT CACAG 
CTTCCCAACT 
CTCCACGGTC 
TAGTAGCCAT 



CTTATACCAG AATCTGCTAG A 
TCACTAAAGG ATCCT7CTAT GGAGGGAAAT 
GCACAGCCCG ATGTTGGATC A 



TTACAGATCT GGAAATGCCA 
GAGGTAACAC CTCATGCTTT TACCCCATCC 
AACGTGGTAT ACTCGCAGAC AACCCAACCG 
GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 
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GAATCCGAGA AGAAGGCAGT 
CTAGTGGTTC TTGTGGGTAT 
3 ACAGTACATC 
3 ATGTCGGAG C AATTCCAATA 



3 TTTTATCTGT 2 52 0 
3 TGCACACTTT 2580 



ATCGTTGCCT 



AATGTGGAAG 



GTGCAAGTGC 
AAGGGCTCCC 
TGGCCTGACA 



AGAACAGGCA 
GTCAACATAT 
GAGGAGCAAT 
GAGGTGCTGG 
GGCAAAACAA 
GACTATTCTG 
CCTGTGGAAA 
AATGCCTCCT 
CTCCTTCATA 
GTTATGATTC 
GATGAGCCTA 
CTATCTAATG 
TATGTACTTG 
AGTAAAACTT 
ATGATTGTTC 
CTTATGCACC 
AATCTGATGA 
ATCCTCAGCC 
GGTGCAGCAT 
AAGGGGTGGG 
ATCAGTCTAG 
GGATTCTGCC 
TACTTATTAT 
GGTATTTTTT 
AAACTACAGA 
AGGTTTGCTA 
GATATTCAAC 
GTAAATACTG 
ATTTTACTAC 
CATTAGCTGG 
CTTTGTTTCA 
AAGTTTTTAT 
TGCAAAAATA 



CAGACAGCTC CAACCACCCA 
ATGATCATAG CAGGGTTAAG 
ATATCAATGC CAATTATGTT 
GCCCACTGAA ATCCACAGCT 
TTATTGTCAT GATAACAAAC 
CTGCCGATGG GAGTGAGGAG 
TTGCCTATTA 
AGAAAGGAAG 
TGGGAGTACC 
AGCGCCATGC 



GTGATCGTGT CAG3GGTGAG T 
TGGAGGAAAT GGTTGCAGAG T 
CTCCAACACC T, 
CAAAGCATGT TGCAGATTTA 
GAGGAAGTGC AGAGCTGTAC T 
GACAACAAGC ACAAGAATCG A 
CTAGCACAGC TTGGTGAAAA GGATGGCAAA 
ACAGACCAAA AGCTTATATT 
GGAGAATGAT ATGGGAACAT 
CTCGTGGAGA AAG3AAGGAG AAAATG1GAT 
TTCTGGT3AC TCAGAAGAGT 
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ACCCAGTGGA 



CTGCCAGTGC T 



TTGGCTTCTT 
ATGTCTTCAT 
ACAGTCATAT 
AGCTAGAGAA 
CAGCCCTAAA 
GATCAAGGGT 
ATATCATGGG 
CCATCAAGGA 
CTGATGGCCA 
TAAATTGTGA 



GCTAGACAGT 
AAAACACATC 
TCATGATACA 



CGTTCACAAA G 



ACAATTCCAG 
GCAATGCAAC 
TGGCATTTCA 
CTATTACCAG 
TTTCTGGAGG 
AAACATGGCA 
GAGCTTTAAG 
TATAATTCAG 
AAGTGAGGCA CTTTCAGTGT 
TTGAACTTAT AAGTGTTATA 
ATGATGAGCA TGGAGGAGTG 
AACTAGAAAA AGAAAATTCC 
GGCCAGGAGT CTTTGCTGAC 
TTGTGAGCAC AAGGCAGGAA 
TGCCTGATGG AAATATAGCT 
GGGACTCACA TCTGAGCATT G* 
TTCTGTTATC TGTTGATTTC O 



AGTGAAATAT 
ATCGAACTTC 
GAGAAGGCAC 



ATGATATGGG 



r TCATCATTAC 



ACCATAATGC 
TTGTTTACTG 
TGGCTGAAGA 
TAGAAGCTAC 
CAAATCCAGA 



CTACACGCAG 

T3GAGTTGGA 
CGAAGGAACT 
GGTACAAACT 
TAAAGAAACT 
T3GACCAGCA 
ACAGCAGAGT 
TTCTATCATC 
AGACT ACAT C 
CCAGCACCCT 
CCAACTGGTG 
GCCAAATAAA 
ACACAAATGT 
ACAGGATGAT 
TAGCCCCATT 



TATCATTAAC 



CTTTCTGTGC 
ACCAGGTAGC 
ATCAGTTTCT 
CCACCTCTCT 
AGTCTTTAGT 
TCCTAAAATT 
ACAGTAACTT 
TTTTTGCAAG 



TCTGTATTGA 



GAAATATAAC 
ATTTTACAAC 
CCCTAGTGTC 



AAATTTTTAG C 



TGCAGTATTC 



GAAATAATCT 



TCTTACTCTA 
GCATGTAATT 
GAGAATAACA 
AATATAAATA 



3600 
3660 
3720 



TCTGACAACC 
CAAGATGATC 
CTACAAAGTG 
GGACAGTAAT 
TTAACACAGA 



TCATGACATA 
ACTTGTAATT 
ATGGAATTGT 
TAGGAATTCC 
AGCAATTATC 
TCTTCCATAT 
GTTACTTATT 
AGATTTTTAT 



4440 
4500 

4520 
4680 



TGTGTTACCT AAGTCATTAA 
TGGAAAATAG AAATACCTTC ATTTTGAAAG 
CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 
TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 



Protein sequence: 



MRILKRFLAC 
QSPINIDEDL 
FKASKITPHW 



I 



TDTVDWIVFK D' 



ESVSRFGKQA 
AVFCEVLTMQ 
DPENYTSLLV 
NMSYVLQIVA 



DRFSSFSEAV 



GSKTVLRSPH 
ENISQGYIFS 
TAQPDVGSGR 
TEVTPHAFTP 
LVIVSALTFI 
IKHFPKHVAD 
KLAQLAEKDG 



MNLSGTAESL 
SENPETITYD 
ESFLQTNYTE 
SSRQQDLVST 
CLWLVGILI 



NSTSQPVTKL 
NTVS ITEYEE 
VLIPESARNA 



KLTDYINANY 
DQYWPADGSE 
QWPDMGVPEY 
TVNIFGFLKH 
AGKTKLEKQF 
INASYIMGYY 
KDEPINCESF 
ISKTFELISV 
INLMRPGVFA 



VNWYSQTTQ 
YWRKCFQTAH 
FEEVQSCTVD 



EYGNFLVTQK 
SLPVLTFVRK 
IRSQRNYLVQ 
QLLSQSNIQQ 
QSNEFIITQH 



RVSGGVSEMV 120 

KGKGKLRALS 180 

YNGSLTSPPC 240 

KFSRQVFSSY 3 00 

QQLDGEDQTK 360 

DNPELDLFPE 420 

GTKYNEAKTN 480 

VEGTSASLND 540 

PATSAIPFIS S00 

NVWFPSSTDI 660 

PHYSTFAYFP .72 0 

LESEKKAVIP 780 

PISDDVGAIP 840 

LGITADSSNH PDNKHKNRYI NIVAYDHSRV 900 

IAAQGPLKST AEDFWRMIWE HNVEVIVMIT 960 
SVQVLAYYTV R 



QSGYVMLMDY LQNKFREQQY 
TWERPRWYD TMIEKFAVLY 
ICTNGLYGKY SDQLIVDMPT 
TNQIRKKEPQ ISTTTHYNRI 
ATEKDISLTS QTVTELPPHT 
ESLLTSFKLD 



QDFILEATQD 
PMIVHDEHGG VTAGTFCAL? 
VILSLVSTRQ 



NGAALPDGNI 



Coding sequence 
CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 300 

AATATCCAAC ATGTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACCGTG 480 

TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600 

AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTCGAAA GTGTTAGTCG TTTTGGGAAG CAGGCTGCTT 780 

TAGATCCATT CATACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACAGTTGA CTGGATTGTT TTTAAAGATA 900 

CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGTGA AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCGAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTTGTA 1080 

GTTCAGAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACCAGCCTT CTTGTTACAT 1140 

GGGAAAGACC TCGAGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACCAACTGAT TGTCGACATG CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AGAAATAATC AAGGAGGAGG 1440 

AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC AGTGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACCAGTCACT AAATTAGCCA 1680 

CAGAAAAAGA TATTTCCTTG ACTTCTCAGA CTGTGACTGA ACTGCCACCT CACACTGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 

GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCAGTCCCG 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCG 1980 

AAAACCCAGA GACAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 2040 

AAGATTCAAC TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGAGA 2160 

GCTTTCTCCA GACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC AGGCCCAGTG ATGTCACAGG GTCCCTCAGT TACAGATCTG GAAATGCCAC 2280 

ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC TCATGCTTTT ACCCCATCCT 2340 

CCAGACAACA GGATTTGGTC TCCACGGTCA ACGTGGTATA CTCGCAGACA ACCCAACCGG 2400 

TATACAATGA GGCCAGTAAT AGTAGCCATG AGTCTCGTAT TGGTCTAGCT GAGGGGTTGG 2460 

AATCCGAGAA GAAGGCAGTT ATACCCCTTG TGATCGTGTC AGCCCTGACT TTTATCTGTC 2520 

TAGTGGTTCT TGTGGGTATT CTCATCTACT GGAGGAAATG CTTCCAGACT GCACACTTTT 2580 

A CAGTACATCC CCTAGAGTTA TATCCACACC TCCAACACCT ATCTTTCCAA 2640 

TTTCAGATGA TGTCGGAGCA ATTCCAATAA AGCACTTTCC AAAGCATGTT GCAGATTTAC 2700 

ATGCAAGTAG TGGGTTTACT GAAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAAGTGC 2760 

AGAGCTGTAC TGTTGACTTA GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC 2820 

ACAAGAATCG ATACATAAAT ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC 2880 

TTGCTGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT 3000 

GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 

AAGGAAGGAG AAAATGTGAT CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT 3120 

TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 

TAAGAAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA 3240 

CACAGTATCA CTACACGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 3300 

TGACCTTTGT GAGAAAGGCA GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC 3360 

ACTGCAGTGC TGGAGTTGGA AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC 3420 

AGATTCAACA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA 3480 

GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 

CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC 3660 

AGTCAAATAT ACAGCAGAGT GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 

ATCGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 3780 

GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 3840 

TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900 

ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT 3960 

TTGTTTACTG GCCAAATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 

TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 

TAGAAGCTAC ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 

CTGCCAATAG GGATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA 4260 

CTTTCTGTGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 

ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT 4380 

ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 4440 

CCACCTCTCT GGACAGTAAT GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG 4500 

AGTCTTTAGT TTAACACAGA AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 

ACAGTAACTT TCATGACATA GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 4680 

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 4800 

TTATAGAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 4860 

CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 4980 

GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT 5040 

TTATAATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG 5100 

TTTAGTTTAA TGACGTAGTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160 

TGTGTTACCT AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG 5220 

AAATACCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA 5280 
ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCCATTAA A 
A AAAAAAAAAA AAA 






21 



PCTDTVDWIV 
TKHEPLTDGY 
TNRSPTRGSE 



GSEHSLEGQK FPLEMQIYCF 
GVESVSRFGK QAALDPFILL 
FKDTVSISES QLAVFCEVLT MQQSGYVMLM 
QADPENYTSL LVTWERPRW 
LPNMSYVLQI VAICTNGLYG 
EGAIVNPGRD SATNQIRKKE 
SLNSTSQPVT 



DADRFSSFEE 



QDLGAILNNL 
KEEEEGKDIE 
FSGKGDVPNT 



DYLQNMFREQ 
YDTMIEKFAV 
KYSDQLIVDM 
PQISTTTHYN 
TSQTVTELPP 



I 

■ AVKGKGKLRA 
YIYNGSLTSP 
QYKFSRQVFS 
LYQQLDGEDQ 
PTDNPELDLF 
RIGTKYNEAK 



SSPATSAIPF 480 



S GRESFLQTNY 



YDVLIPESAR 



IPLVIVSALT 



NVEVIVMITN 



EVLDSHIHAY 
PVERSRVGIS 
VMIPDGQNMA 
YVLEVRHFQC 
LMHQLEKENS 



RWTQYHYTQ W 
MLQQIQHEGT V 
VNALLIPGPA G 
SLSGEGTDYI N 



QEVQSCTVDI, 
YGNFLVTQKS 



I RSQRNYLVQT 
3 LLSQSNIQQS 
3 SNEFIITQHP 



PKWPNPDSPI S 
VDVYQVAKMI N 
ESLESLV 



Seq ID NO: 187 DNA sequence 
Nucleic Acid Accession #: EOS sequence 
Coding sequence: 148-4632 



DNKHKNRYIN 
AAQGPLKSTA E 
VQVLAYYTVR N 
AYAKRHAVGP V 
EEQYVFIHDT LVEAILSKET 
DYSAALKQCN REKNRTSSII 
LLHTIKDFWR MIWDHNAQLV 
LSNEEKLIIQ DFILEATQDD 
MIVHDEHGGV TAGTFCALTT 
ILSLVSTRQE ENPSTSLDSN 



CTTGTTGAAG 



I 

3 CACGCACGAT 
: ATTTCCTTCG 
CCGCAGACCG 
GTGTTTGCCG 
AGATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 



AAATG CAAT A 

GGAAAAGGGA 
GATTTCAAAG 



TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 



I 

" CTCACTTCGA 
CTCCCCCTCC 
TCTGG AAATG 
CCTGGATTGG 
GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TGGGAAAACA 
AATGGTGTTT 
TGGATCAGAG 
TGATGCGGAC 
TTTATCCATT 



TCTGGTTATG 



AGTTCAGAAC 
TGGGAAAGAC 
CAGTTGGATG 



TCATACTGTT 
TCTCTGAAAG 
AGGTGTTTTC 
CTCGAGTCGT 



I 

• TCTATACACT 
CTGTCCACTC 
CGAATCCTAA 
GCTAATGGAT 
GGAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTGGAAAITA 
AAAGCAAGCA 
CATAGTTTAG 
CGATTTTCAA 
TTGTTTGAGG 
AGTGTTAGTC 
CCAAACTCAA 
GACACAGTTG 



41 

I 

GGAGGATTAA 
TGAGAAGCAG 
AACGTTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 
GTTTTGAGGA 
TTGGGACAGA 



AACAAACAAA 
AGGAGCCGCA 
CGCTTGCATT 
ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 



TGACTACCGT 
TCACTGGGGA 

ATTTCCACTT 



AGAAAATTTG 72 0 



GGACTACTTA 
CTCATACACT GGAAAGGAAG 
TCAGGCTGAC CCAGAGAATT 
TTATGATACC ATGATTGAGA 
AACCAAGCAT GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 



TTCGAGAGCA 
AGATTCATGA 
ATACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 



TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 
ACAGTACAAG 
AGCAGTTTGT 



TTTGTACCAG 



TTGTCGACAT 



AGTAGCCATA 
CAAGGAGGAG 



AACCAAATCA 
ACGAAATACA ATGAAGCCAA 
AAGGGTGATG TTCCCAATAC 
ACAGAAAAAG ATATTTCCTT 
.GAAGGTACTT CAGCCTCTTT 



AGTTTATTGA 



AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TCCAGACAAC 
GTATACAATG 
GAATCCGAGA 
CTAGTGGTTC 



ATTTCAGATG 
CATGCAAGTA 
CAGAGCTGTA 



CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
AGGCCAG T AA 
AGAAGGCAGT 
TTGTGGGTAT 
ACAGTAC AT C 
ATGTCGGAGC 
GTGGGTTTAC 
CTGTTGACTT 



GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 
CACTGAGATA 



TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTC-CTAG 



CTTCCCAACT 
CTCCACGGTC 
TAGTAGCCAT 
TATACCCCTT 
TCTCATCTAC 
CCCTAGAGTT 
AATTCCAATA 
TGAAGAATTT 
AGGTATTACA 



TCCACTTCCC AACCAGTCAC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCC 
CTTATACCAG 
TCACTAAAGG 

ATGTTGGATC 
AATCTGAGAA 
TTACAGATCT 
CTCATGCTTT 
ACTCGCAGAC 
TTGGTCTAGC 
CAGCCCTGAC 
GCTTCCAGAC 
CTCCAACACC 



GAGGTAACAC 



GAGTCTCGTA 
GTGATCGTGT 
TGGAGGAAAT 
ATATCCACAC 
AAGCACTTTC 
GAGACACTGA 
GCAGACAGCT 



TCGCATAGGG 
ATTCTCTGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TGAGGAGGAG 
CTCCAGTCCC 
ATTTTCCTCC 
AAATGCTTCC 
GGAGGGAAAT 
AGGCAGAGAG 
GACAACCAAG 
GGAAATGCCA 
TACCCCATCC 2 340 
AACCCAACCG 2400 



TTTTATCTGT 2S2 0 

TGCACACTTT 25 30 

TATCTTTCCA 2 640 

TGCAGATTTA 2700 

CCAGGAAGTG 2 7 60 

C AGACAACAAG 2 820 






WO 02/086443 PCT/US02/12476 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 2880 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 3000 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 3120 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 3240 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 3300 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACCCA GGCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720 

GGCTTAACTG ATCCTCCTAC CTCAGCCTCC CGAGTGGCTG GGACTATACT CCTGAGCCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 3840 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 4080 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 4380 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGGGCACAA GGCAGGAAGA GAATCCATCC 4560 

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 5160 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 5220 

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340 

ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 5400 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 188 Protein sequence: 
Protein Accession EOS sequence 

1          11         21         31         41         51 

|          |          |          |          |          | 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 

IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI TADSSNHPDN KHKNRYINIV 900 

AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIWEHNV 960 

EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ VLAYYTVRNF TLRNTKIKKG 1020 

SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY AKRHAVGPVV VHCSAGVGRT 1080 

GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE QYVFIHDTLV EAILSKETEV 1140 

LDSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA 1200 

SRVAGTILLS QSNIQQSDYS AALKQCNREK NRTSSIIPVE RSRVGISSLS GEGTDYINAS 1260 

YIMGYYQSNE FIITQHPLLH TIKDFWRMIW DHNAQLVVMI PDGQNMAEDE FVYWPNKDEP 1320 

INCESFKVTL MAEEHKCLSN EEKLIIQDFI LEATQDDYVL EVRHFQCPKW PNPDSPISKT 1380 

FELISVIKEE AANRDGPMIV HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINLM 1440 
RPGVFADIEQ YQFLYKVILS LVGTRQEENP STSLDSNGAA LPDGNIAESL ESLV 



CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAC- 
CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAC-ACATTG CTATGGGAGA 
CGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 
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TTCAGAGGAA GCGCCTCTGA 
GTTTGGAGAA AGCACAGTTG 
ACGATGCAGC GGAGACTGGT 
GTGCCCTCCT GCGGGCGCTC 
GAACATCAGC TCCTCCATGA 
CTTCACCATC TGATCGCAGA 
CCTAACTCCA AGCCCTCTCC 
GAGGGCAGAT ACCTAACTCA 
AAGACACCTG GGAAGAAAAA 
AAACGGCGAA CTCGCTCTGC 
GACCACCTGT CTGACACCTC 
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TTTGTTTCTT T 
GAGTAGCCGG T 
TCAGCAGTGG A 



GCTGAAATCA 
AACCACCCCG 
AAGGTGGAGA 



TCCGATTTGC- GTCIGATC-AT 
CGTACAAAGA GCAGCCGCTC 
GCAAGGAGCA GGAAAAGAAA 



CATCAATCCT 

ATCTTCATAA TTTGCTGGAG AAGTGTATTT 
TTCTTCAGTG TTTTTCATTT CTTACGTTCT 
GATATTATCT ACAAACACTG CAGAACAGCA 
ACTTTTTATT TAATTAAATG TATTTAATTA 
TAAA1TATGT TTTAAACACA TGCCTTAAAT 
CCAGCTCATA CABAATAAAT GGTTTCTGAA 
GGTTTTTCTC ATGTATCTTT TTGTTCATTG 
CCGTAGGAAA AATAAAACTT CACATTTAAA 



C TCTGGAGTGA 

ATTCACGGTA ACAGGCTTCT 
GGAGCCTCCC TTCTGCCTTG 
ATCGATTGTG TAGCAATTGA 
CCCCCTACCA CACACACCCC 
CTTTCTCCAC CGTCACCCAA 
AGCTTCAGAA GCTAGTGACC 
CTCTCACACC TGGC-CAAACT 
GGGAGAATAT A 
TCATGTCATA AACGATTCTG 
AATCTGAAAT TTATTTTAAT 
TTGTTTAATT AAATTTAACT CTGGTTTCTA 
AATGTTTAAG TATTAACTTA CAAGGATATA 
GCAAGATGAA ATAATTTTTC TAGGGTAATG 



GCCCATTCCT 
CTTCCCCTTA 



MQRRLVQQWS VAVFLLSYAV PSCGRSVEGL SRRLKRAVSE HQLLHDKGKS IQDLRRRFPL 
HHLIAEIHTA EIRATSEVSP NSKPSPNTKN HPVRPGSDDE GRYLTQETNK VETYKEQPLK 
ITRSAWLDS GVTGSGLEGD HLSDTSTTSL ELDSR 




CGGCTGTCCG 
CCTCGCATGC 
GGTATCGTGG 
GCGGCCACGG 
GCCAACCTGT 
GGCCCGGAAG 
GTGGATTTGC 
CTGGGCAGGG 
TGCCAGGTGT 



3 TGGGCCTGAG 



GCGCACCCCG 
TTCTCTTGCT 
GCCCAGCTTG 



1GACCCCCTA 
TTAGTCCTGG 
GGACACTGCC 
AGCCTTCTTG 



CTATATTAAT 



AGAGCGCGGC 
CCGAGGGCCG 

GCTTCTTCCT TGGCAAGATG 
CTCAGGTGCG GGAGGAGCTC 
CCCCCACGCA CGCGGACGGG 
TCGCCGAGGC GCTGCAGGCC 
TGGGTGGCTG CACTTGGCTG 



CACTTGCGGC 
AGGTACCCTA 
TGTGCCTCCC 
GCTGCATGAG 
CGTGCAGCTT 
TGAGCCCACT 
AGCACTAATC 
AGCTGGGACC 
TCAGGTCCTC 
GGCCTAGCCT 
ATGCAAACAC 
GTGTCTTTC 



GC1ACCCCAG 
CTTGGGAGCG 
CCCAGGATGG 
AGGTCCCCTG 
CAGACAACCA 



ACCTCTGGGC 
GCTGCAGGCA 
CCATGTTGCA 
AAAATAACGT 



ATCGCTGCTC 
AGGCGGTGGC GGCCGGAGAC 
TAAGCTGCTT CCGGGAGCTG 
CACCAGCACG TGCACGTGCT CCCAGGCGTG 
GCTTTACGCG ACTGCCGCTG 
CGCGTGCCTT CGCCTGCGCC 
CCCTTCTCCC GCCACGGCCT G 
CGGCACATGT CCGCTCACCG C 



GCGGTGAAGG CCCCGACGCT 
CTGCGCGTCC TCACCGCGCC CACGCTGCGG 
TGCGCCCTCG ACGACCTGGA CTCCAAGAGG 
CTGGAACCCT TCCTGGAACC CTCCCTACTC 
CCCTTAGTAC CAAGAAAGGG GAGCCAGGAT 
TGGAGCACGA TCTGTTGACT TCCCTGGGTA 
ATGCCTCCAA ATGGCATCTA GAGTTTGAGC 
GTGGCAGCGG GCTAGGGCCC GCAGAGCATT 
CTTCACCACT GGGGCAGTGG GGAGAGATGG 



Protein Accession U 



I I I I I I 

MSRPRMRLW TADDPGYCPR RDEGIVEAFL AGAVTSVSLL VNGAATESAA ELARRHSIPT 
GLHANLSEGR PVGPARRGAS SLLGPEGFFL GKKGFREAVA AGDVDLPQVR EELEAQLSCF 
RELLGRAPTH ADGHQHVHVL PGVCQVFAEA LQAYGVRFTR LPLERGVGGC TWLEAPARAF 
ACAVERDARA AVGPFSRHGL RWTDAFVGLS TCGRHMSAHR VSGALARVLE GTLAGHTLTA 
ELMAHPGYP.S VPPTGGCGEG PDAFSCSWER LHELRVLTAP TLRAQLAQDG VQLCALDDLD 
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AGAAGATGAA GGATATCGAC 
GTGTGAGGGA GAGAACCAGC 
GGAGAACTCG ACCGTTGGAA 
TCTCTCTTGA TGCCTCCATG 
A TCATGGCTTG 



TCATCCTGTC CATCGTGTGC 



CATTCTCAGC TCAGAATCCT GGATGAGGAG C 
AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 
CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 
GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 
TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 
TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 



GAGTATACCC A 



TTAAGAAGAT 



GAGGACCCGT TGTTGCCATC 
GCTTCCTGGG ATCAGCTGTT 
TCACAGCATA 



ACCGGTGTCC 
AAGAACATTA 
AGAATGTTTG 



AAGAGAAATC 
AGGCAGCAGC 
TTTATAATGT 
TTTACCCAGC 
CCGCCACGGA 



GTCTAACCTG 
GTCTTGGTCG 
GGCCATCCTA 
CCTGGGTGAG 
CGTTGGCAGC 



CAGTACAGCT 780 



AGAGTGTTCA 
AGGGTATCAC 
CTGTTCATAT 
TCTTCAATTC 
AAGCCTCAGT 
TAAAGAACAA 
GGGACTCCTC 
ACAAGAGGGC 
AGGCGGTGCT 
CCGAAGAGGA 
ACAGCATCGA 
GTGGAAAAAC 
TTGCAATCAG 



TGTGGGTGTG 



CATGACTTTT 
GGCTGTTGAC 
ACCAGCCAGT 
CCACTCCAGT 
TTCCAGGGGC 



GAGGAGGAGC 
GCTCCCATTG 
TTCGATCTGA 
GCTTTGAAAG 
AGATTTAAGA 



GTCGGATATT 
TGGTGGTGAT 
CAGCAGCACA 
TAACACCGTT 
GTTTGTTTGT 



AATGATGTTT 
TGAACGTGTC 
CTGGGTCAAA 
GGAAAAAGCC 
TGCCAGCGTG 
GGCTTTCACA 



ACCATGGCAT 
CTCATCAACA 
CTGCTGGCTC- 
GGACCAACAG 
GCATCACGGC 
CAGAAGATGA 
GCATTTTCTC 
GGGTACTTCC 



AGAAGGCAAG 
TCTGGAGATC 
CTCTCTCATT 
TGGAACCTTC 
CATCCTGTTT 



T CGCCCAAGCT 
A AGGTGAGGCA 
C TCCTCCTGGA 



ATCCAGAACT 
AAGAAAGAGA 
AAAGGCCACC 
CACATCCACC 
CAAGAGGGTA AACTGGTTGG 
TCAGCCATTT 
GGTTATGTGG 
GGGAAGGAAT 



AATGGAAGAG 
GAAAAATGCC 
GACCCCCAAA 
GCTGCAGCGC 



TCCCTCTCAG 



GAGAGCGAGG 
TGTATAGTGA 
TGGGCAACCA 
TTGTTACCCA 
GCTGTATTAC 



AGGAAAAAGC 



AGCCAACCTG 
GAGGAG CATC 
CATCTTCAAT 
CCAGTTACAG 
GGAAAGAGGC 
TAACCTGTTG 
TTCACAGAAG 



CCCAGCAGGC 
ATGATGAAGA 
GACCTGGCCA TTCTTCCCAG 
AGCGCCAGAG 
ACGACCCCCT 
GGAAACATCT 



GCGCTTACAG 
AAICTGCGGC 
GACGCTICTA 
CTGGATCCTC 
AAGATAC 



ACCTTGGCAT 
ATGAAAAAAG 
ACTGAGCATC 
CGGCCCAGTC 
AGGACACTGC 



TACATCCTGG A 



ACCCATGAGG 
CTGGGAGAGA 
AAGTCACAAG 
GAGGAAGGCC 



CAGCGACCTG ACGGAGATTG 
GATCAGCCTT 
CAGTGCCTTA 
CAAGTCCAAG 
AGTGATCTTC 
TTTAAATGGT 
TGAGATCAAT 
TAAAACAGGA 
GCTGGAAGAG 
TGCTGGGGGC 



AGGGCCT TGC 
AGCTGCTGGA 
CTGTGCGGCT 
TTATGCACGG 
TAACGGGGCT 



TGACAACCAA 



GTGATATTGG 
TGTTCAGTGG 
TTTGGGATGC 
TTGAATCTGA 
GCATAGCTAG 
CCATGGACAC 
GTACCATGCT 
TGCTGGCCCA 



ATTCCATATT 
TCTATATATA 
TTGCTGTACT 



CAGGAT CAGG 
GCTGCATCAA 
AACTCTCTAT 
ACCCCTTCAA 



ATAGTGGGCC 
GAGACGGGTG 
CTGTCCTGGT 



AAGGGCAGGA 
TTTTGTTTAC 
TCATCACCAC 

GCAGATTCCC
GTTCCAGTTT
GATCAATCAC
TCCCTCCCCT
CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA 
GATTGGCATT GTGGGGCGGA 
TCTGGTGGAG TTATCTGGAG 
CCTTGCCGAC CTCCGAAGCA 
CACTGTCAGA TCAAATTTGG 
ACACACATGA 
AATGGGGATA 
CGCCACTGTA 
TTATTGATTC 
CATCGCCTGC 
GTGGAGTTTG 
TTTGCTGCTG 
TCTTTTCTTT 
CCGAAACCTT 
TTTCACTTTT 

CATGTAAACA 
ATTATAATTG 
ATTCTGTACA 
AGCACTGTGC 
AGAGATCTGG 
TGGTTTCACG 
CTCCGACAGC 
GGCGGCTGGA 
GTCACTTACT 
TCCATCAAGA 



AGTGATGGAG 
AGCCCTGCTC 
AGAGACAGAC 
GACCATTGCC 
GGGACAGGTG 



ACTTCTCAGT 
AGATTCTGAT 
AAGAGACCAT 
ACACGGTTCT 
ACACCCCATC
CAGAGAACAA 
AGAGCATTGC 
GCCTTTCTCG 
AGGGAGAGTC



GTGTGCGATG 
CACGGGGCTG 
CATCTCTTAT 
GACAGAAGCT 
GGAAGCACCT 
GGTGACCTTT 
AGTATCCTTC 
GAAGTCCTCG 
GATTGATGGA 
CATTCCTCAA 
CCAGTACACT 
TGCTCAGCTA 
GGGGGAACGG 
TTTAGATGAA 
CCGAGAAGCA 
AGGCTCCGAT 
GGTCCTTCTG 
GGTCGCTGTC 
CATTCCCTGC
ATTTTATCTT 
ATATTTTGAT 



2040 
2100 
2160 



GATGCCCATG 
ACAGTTCTGT 
ATGAAAGAGG 



GACTATGCTA 2460 




ACGATCAAAC
CTGGGGATGG 
GTGAGAATCA



GAAGACCAGA 



CAGCTCTTGT 
GCCACAGCTG 
TTTGCAGACT 
AGGATTATGG 
TCCAACGACA 
AAGGGCTGAC 



TCGCACAGCA



TATCAGAGGC CTATAATGAA GCTTTATACG 
AATGTAAGCT 
TTCTATCATT 
AAC-AGTAGCA 
CCAAAGGAAG 



TTTTGCTATT 
GTGCCAGGTT 
CCCCTCTGCC 
GACCATGCAG 
GTTTCTGTCA 



T GCTGTTGTTT 
C TCAGGTTCCT ATGGCTGGCC 



GCATATTCCT 
AGACTGTAGG 
TTCTGGGTGT 
GCCTCCCCAC 
AGCGCCGTGA 
GGAGAGCAGC 
CAGAGACATT 
CTAAACAAGA 
ACTGCACAGA 



AAAAGGTTCA 
GTTTATTTTA 



4020 
4080 
4140 
4200
4260 

4380 

4560 
4620 
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GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACITTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGAIT TCC-TGGGTCT GTTTTCCTTT 5460 

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 

ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT rGCTGAACAC "TTGTGGAAG 5760 



CAAAAATCTG AAAATGTGAA TAAAATTATT 



TAAAAAAAAA AAAAAAAAAA 5820



11)111 
MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL Bi«uihj»»o 
LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 
VAHKKGELSM EDWISLSKHE SSDVNCRRLE RLWQEELNEV C-PDAASLRRV VWIECRTRLI 
LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTSI VRSKSLALTW 
ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFSA AAVGSLLAGG 
PWAILGMIY NVIILGPTGF LGSAVFILFY PAMMFASRLT AYFRRKCVAA TDERVQKMNE 
VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIW VIASWTFSV 
HMTLGFDLTA AQAFTWTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 
NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTSHQA 
VLAEQKGHLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS IDLSIQEGKL VGICGSVGSG 
KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKSYD EERYNSVLNS 
CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 
NHIFNSAIRK HLKSKTVLFV THOLQYLVDC DEVIFMKEGC ITERGTHEEL MMLNGDYATI 
FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEXSQGS 
VPWSVYGVYI QAAGGPLAFL VIMALFMLNV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS 
VSDSMKDNPH MOYYASIYAL SMAVMLILKA IRGWFVKGT LRASSRLHDE LFRRILRSPM 960 
KFFDTTPTGR ILNRFSKDMD EVDVRLPFQA EMFIQNVILV FFCVGMIAGV FPWFLVAVGP 1020 
LVILFSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG IATIHAYNKG QEFLHRYQEL 1080
LDDNQAPFFL FTCAMRWLAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG LAISYAVQLT 1140 
GLFQFTVRLA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTFENAEM 1200 
RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD 1260 
IGLADLRSKL SIIPQEPVLF SGTVRSNLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 1320 
SEVMENGDNF SVGERQLLCI ARALLRHCKI LILDEATAAM DTETDLLIQE TIREAFADCT 1380 
MLTIAHRLHT VLGSDRIMVL AQGQWEFDT PSVLLSNDSS RFYAMFAAAE NKVAVKG 



Seq ID NO: 195 DNA sequence
Nucleic Acid Accession NM 
Coding sequence: 228..1922



I I I ' ' 

GCTGTCCTGA GCCTGAGTAC TCTAGCTGCC TTGTCGCCAT CGCATCTGGC TGCCATCCAG 60 

CGCCAGCACA CAGTAATGAG TGGCCGAGCT TCCTCTGGGA GGGAGGAAAC AGTTAAAATC 120

TTGCAGCAGC TGCAATCATC TAGGCGTGGT TCTCTTGTCT GACTTGGGCT GCACAGATCC 180 

TGGGCCAAGG GACAGAAGAA AGACAGCCTA GGAGCAGAGC CTCCCAGATG GCTGAGTTGG 240

ATCTAATGGC TCCAGGGCCA CTGCCCAGGG CCACTGCTCA GCCCCCAGCC CCTCTCAGCC 300 

CAGACTCTGG GTCACCCAGC CCAGATTCTG GGTCAGCCAG CCCAGTGGAA GAAGAGGACG 360

TGGGCTCCTC GGAGAAGCTT GGCAGGGAGA CGGAGGAACA GGACAGCGAC TCTGCAGAGC 420

AGGGGGATCC TGCTGGTGAG GGGAAAGAGG TCCTGTGTGA CTTCTGCCTT GATGACACCA 480

GAAGAGTGAA GGCAGTGAAG TCCTGTCTAA CCTGCATGGT GAATTACTGT GAAGAGCACT 540

TGCAGCCGCA TCAGGTGAAC ATCAAACTGC AAAGCCACCT GCTGACCGAG CCAGTGAAGG 600

ACCACAACTG GCGATACTGC CCTGCCCACC ACAGCCCACT GTCTGCTTTC TGCTGCCCTG 660 

ATCAGCAGTG CATCTGCCAG GACTGTTGCC AGGAGCACAG TGGCCACACC ATAGTCTCCC 720

TGGATGCAGC CGGCAGGGAC AAGGAGGCTG AACTCCAGTG CACCCAGTTA GACTTGGAGC 780

GGAAACTCAA GTTGAATGAA AATGCCATCT CCAGGCTCCA GGCTAACCAA AAGTCTGTTC 840

TGGTGTCGGT GTCAGAGGTC AAAGCGGTGG CTGAAATGCA GTTTGGGGAA CTCCTTGCTG 900 

CTGTGAGGAA GGCCCAGGCC AATGTGATGC TCTTCTTAGA GGAGAAGGAG CAAGCTGCGC 960 

TGAGCCAGGC CAACGGTATC AAGGCCCACC TGGAGTACAG GAGTGCCGAG ATGGAGAAGA 1020

GCAAGCAGGA GCTGGAGAGG ATGGCGGCCA TCAGCAACAC TGTCCAGTTC TTGGAGGAGT 1080 

ACTGCAAGTT TAAGAACACT GAAGACATCA CCTTCCCTAG TGTTTACGTA GGGCTGAAGG 1140

ATAAACTCTC GGGCATCCGC AAAGTTATCA CGGAATCCAC TGTACACTTA ATCCAGTTGC 1200

TGGAGAACTA TAAGAAAAAG CTCCAGGAGT TTTCCAAGGA AGAGGAGTAT GACATCAGAA 1260

CTCAAGTGTC TGCCGTTGTT CAGCGCAAAT ATTGGACTTC CAAACCTGAG CCCAGCACCA 1320 

GGGAACAGTT CCTCCAATAT GCGTATGACA TCACGTTTGA CCCGGACACA GCACACAAGT 1380

ATCTCCGGCT GCAGGAGGAG AACCGCAAGG TCACCAACAC CACGCCCTGG GAGCATCCCT 1440 

ACCCGGACCT CCCCAGCAGG TTCCTGCACT GGCGGCAGGT GCTGTCCCAG CAGAGTCTGT 1500 

ACCTGCACAG GTACTATTTT GAGGTGGAGA TCTTCGGGGC AGGCACCTAT GTTGGCCTGA 1560 

CCTGCAAAGG CATCGACCGG AAAGGGGAGG AGCGCAACAG TTGCATTTCC GGAAACAACT 1620 

TCTCCTGGAG CCTCCAATGG AACGGGAAGG AGTTCACGGC CTGGTACAGT GACATGGAGA 1680 

CCCCACTCAA AGCTGGCCCT TTCCGGAGGC TCGGGGTCTA TATCGACTTC CCGGGAGGGA 1740

TCCTTTCCTT CTATGGCGTA GAGTATGATA CCATGACTCT GGTTCACAAG TTTGCCTGCA 1800 

AATTTTCAGA ACCAGTCTAT GCTGCCTTCT GGCTTTCCAA GAAGGAAAAC GCCATCCGGA 1860

TTGTAGATCT GGGAGAGGAA CCCGAGAAGC CAGCACCGTC CTTGGGGGTG ACTGCTCCCT 1920 

AGACTCCAGG AGCCATATCC CAGACCTTTG CCAGCTACAG TGATGGGATT TGCATTTTAG 1980 

GGTGATTTGT GGGCAGAAAT AACTGCTGAT GGTAGCTGGC TTTTGAAATC CTATGGGGTC 2040

TCTGAATGAA AACATTCTCC AGCTGCTCTC TTTTGCTCCA TATGGTGCTG TTCTOTATGT 2100 

GTTTGCAGTA ATTCTTTTTT TTTTTTTTGA GACGGAGTCT CGCACTGTTG CCCAGGCTGG 2160 

AGAGCAGTGG CGCGATCTTG GCTCACTGCA AGCTCCGCCT CCCGAGTTCA AGCAATTCTC 2220 

CTGCCTCAGC CTCCCGAGTA GCTGGGATTA CAGGTGCCTG CCACCACACC CAGCTAATGT 2280 

TTTGTATTTT TAGTAGAGAT GGGGTTTCAC CATGTTGGCC AGGCAGATCT CAAACTCCTG 2340 
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T GCACCCACCT CGGCCTCCCA AAGTGCTGGG ATTACATGCG TGAGCCACTG 2400 
CGTAGTA ATTTTTAGGC ACCAAATCTC CCTCATCTTC TAGTGCCATT 2460 
iGGTAAA TGTCACACTG TGCCCAGAAT GGATGACCAG G 
\ AAAGATTGCA GAGTTATCAT AATAAATTGC TAACTTGCGT 
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PLPEATAQPP 






EQAALSQANG 
VGLKDKLSGI 
EPSTREQFLQ 
QQSLYLHRYY 



NAIRIVDLGE 



CPAHHSPLSA 
EKAISRLQAN 
IKAHLEYRSA 
RKVITESTVH 
YAYDITFDPD 
FEVEIFGAGT 
PFRRLGVYID FPGGILSFYG VEYDTMTLVH 
EPEKPAPSLG VTAP 



A FCCPDQQCIC 
NT QKSVLVSVSE 
A EMEKSKQELE 
H LIQLLENYKK 
D TAHKYLRLQE 



QDCCQEHSGH 
VKAVAEMQFG 
RMAAISNTVQ 
KLQEFSKEEE 
ENRKVTNTTP 



CESHLQPHQV 
TIVSLDAARR 

YDIRTQVSAV 
WEHPYPDLPS 
SGMKFSWSLQ 
KFACKFSEPV 



LGRETEEQDS
NIKLQSHLLT 
DKEAELQCTQ 



WNGKEFTAWY 480 
YAAFWLSKKE 540 



Coding sequence: 433-1149 
1 11 21 



GCGCGTTCAG 



GGCGCAAGAG 
CACTGACTTT 

CAGCCTCACA 



TTGCTCCCAC 
GCTCCCGCTT 
GTCCCCCTCG 
CCGCCGCTGC 



GCATGGAAAG 
CCCAGCAGCC 
CCGCAGCCGC 
AGCAGCAGCA 
GTCACAAGTC 



I 

AGCGCAGCCT TAGTAGGAGA 
TGCTGCTGCT TCTGCTTTTT 
CGCGAGCGCC ACGCGAGGCT 
AGGCGGCGTG CAGGGAGGAG 
CTCCCGGGGA TTTTGTATAT 
TTTCTTTCCC TCTCTGTTCC 
ACCTCGCGTC CCGGATCGCT 
CTCTGCCAAG ATGGAGAGCG 
CTTCCTGCCG CCCGCAGCCT 
CGCAGCGGCA GCGCAGAGCG 
GCAGGCGCCG CAGCTGAGAC 
AGCGCCCAAG 



GGAACGCGAG ACGCGGCAGA 60

TTTTTCTTAG AAACAAGAAG 120 

CCCGAAGCCA ACCCGCGAAG 180 

AAAAAGCATT TTCACCTTTT 240 

ATTTTTTAAC TTCCGTCAGG 300 

TGCACCCAAG TTCTCTCTGT 360 

CTGATTCCGC GACTCCTTGG 420 

GCGGCGCCGG CCAGCAGCCC 480 



GACAGCGCTC 



AGTGCTTTCT 



CTGGTGCGAA 



GCAACCGCGT 
GCGCGGCCAA 
GCGCGCTGCA 
TCCTGTCGCC
CGGTCTCATC 
AGCTTCTCGA
TGGACTTTGG 
GTTGGGAGGG 



GCAGCAGCAG 
CGGCCAGCCC 
GTCTTCGCCC 
CCTGCCGCAG 
CAAGTTGGTC 



CCAACCCCAT 
AGCGCTCAGA 
ACCTGAGTCA 
GAGCAGCACA 
GCTCGGGTCC 



CGCCAACTAA GCGAGGCATG CCTGAGAGAC 
ACAGTATCTT TGCACTCCAA TCATTCACGG 
ATGCGCAAAA TGCAGCTTGT GTGCAAAAGC 
CGCGTTATAG TAACTCCCAT CACCTCTAAC 
CTTCACCTCC CCGCCCTTTC TTAGAGTGCA 



ATGGCTTTCA 
AGATATGAAG 
AGTGGGCTCC 
ACGCACAGCT 
GTTCTTAGCC 



CACCATCTCC 
CTACTCGTCG 
CTTCACCAAC 
AAGCAGGGTG 
GGAGAAAAGG 
AAACAGTCAA 
GAAAACGGGA 
AGCAACTGGG 
TGGCAGAAGG 
GAAAGTTCTT 
CTCTAGAAAC 



G GAGQQPQPQP QQPFLPPAAC FFATAAAAAA AAAAAAAQSA QQQQQQQQQQ 
QQQQAPQLRP AADGQPSGGG HKSAPKQVKR QRSSSPELMR CKRRLNFSGF GYSLPQQQPA 
AVARRNERER NRVKLVNLGF ATLREHVPNG AAWKKMSKVE TLRSAVEYIR ALQQLLDEHD 
AVSAAFQAGV LSPTISPNYS NDLNSMAGSP VSSYSSDEGS YDPLSPSEQE LLDFTNWF 



ATGACAGAGA 
TGCAGCCCCC 
AAGGTGGGAG 
GCCTTCTACT



I 



I 



TTTAAAATGG 

ATTCCTGAGG 
ATGCCAGTCA 
GACAACAGCT 
CTTAAACCAA 
GTTCCAACTA 
CTGAATAATG 



GCCCTGGTGG GACCTGATGA CGTGGAATTC 
CGGCGTACGC TACGCTGACG G 
CCGTGGTCCT CATTTCGGGA Gl 
TCTGGAAGGG GAGCGACAGT CACATTTACA A 
AACTACAAGA TGGGTCAATG G 
GAAGTGGAGC TGAAGAAGCA ATTGCAGTTA A 
AGGAGAGAAG T 



TGCTCTTTGG 
ATGTCCATTA 
CTGGGAACAA 
ATGATTTCCA 
AAGCGCAAGT 
CCAAACTGGA 



GAAGGCTCGT 



CTGTA 



AATATGAAGA AAATTCTCTT ATCTGGGTGG C 
TCTTGAGTTC TAAGGTGTTA GAACTCTGCG G 
CCTATCCAAA AGAAATCCAG AGGGAAAGAA G
CCACAAAAAG ACCACACAGT GGACCACGGA G 
AAACCAGACC CAGTGTTCAA GAGGACTCAC AAGCCTTCAA TCCTGATAAT 780



\ GCCTGTGAAG 
Z TATTTTCTGG 
r AAGAAAAATT 
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CCTTATCATC AGCAGGAAGG GGAAAGCATG ACATTCGACC CTAGACTGGA TCACGAAGGA 840 

ATCTGTTGTA TAGAATGTAG GCGGAGCTAC ACCCACTGCC AGAAGATCTG TGAACCCCTG 900

GGGGGCTATT ACCCATGGCC TTATAATTAT CAAGGCTGCC GTTCGGCCTG CAGAGTCATC 960 

ATGCCATGTA GCTGGTGGGT GGCCCGTATC TTGGGCATGG TGTGAAATCA CTTCATATAT 1020

CACGTGCTGT AAAATAAGAA CTAGCTGAAG AGACAACCAA AGAAGCATTA AGGCAGGTTG 1080

ATGCTGATGG GACCATAAAA TATTTTTACA CGCAGCCTGA GCGGTTATTC TTGACACTCT 1140 

TAACAGAATT TTTTTAATCG TTTTCCAGAA CTTTAGTATA TGCAAATGCA CTGAAAGGGT 1200 

AGTTCAAGTC TAAAATGCCA TAACCCCGTT ATTTGTTATT TTTTATTTGC ATTGATTTGC 1260 

CATAAGTCTT CCCTTGCTTG CATCTTCCAA AGCTATTTCG AAATAAACAC GAAAATTTAC 1320
AGTTTGCC 



MTENSDKVPI ALVGPDDVEF CSPPAYATLT VKPSSPARLL KVGAWLISG AVLLLFGAIG 60 

AFYFWKGSDS HIYNVHYTMS INGKLQDGSM EIDAGNNLET FKMGSGASEA :AV1IDFQNGI 120

TGIEFAGGEK CYIKAQVKAR IPEVGAVTKQ SISSKLEGKI MPVKYEENGL IWVAVDQPVK 180 

DNSFLSSKVL ELCGDLPIFW LKPTYPKEIQ RERREWRKI VPTTTKRPHS GPRSNPGAGR 240 

LNNETRPSVQ EDSQAFNPDN PYHQQEGESM TFDPRLDHEG ICCIECRRSY THCQKICEPL 300 
GGYYPWPYNY QGCRSACRVI MPCSWWVARI LGMV 

Seq ID NO: 201 DNA sequence

Nucleic Acid Accession #: NM_000728.2

Coding sequence: 112..495

1 11 21 31 41 51 

GTAATAAGAG CGGGGTCTCC GCGGGGAAGG CGCCCACAGC AGGTGTGGTG TTCATCCCGG 60 

GTCGACCGGC CGCTCGCGCT GCCCTGAAAC TCTAGTCGCC AGAGAGGCGG CATGGGTTTC 120 

CGGAAGTTCT CCCCCTTCCT GGCTCTCAGT ATCTTGGTCC TGTACCAGGC GGGCAGCCTC 180 

CAGGCGGCGC CATTCAGGTC TGCCCTGGAG AGCAGCCCAG ACCCGGCCAC ACTCAGTAAA 240 

GAGGACGCGC GCCTCCTGCT GGCTGCACTG GTGCAGGACT ATGTGCAGAT GAAGGCCAGT 300 

GAGCTGAAGC AGGAGCAGGA GACACAGGGC TCCAGCTCCG CTGCCCAGAA GAGAGCCTGC 360 

AACACTGCCA CCTGTGTGAC TCATCGGCTG GCAGGCTTGC TGAGCAGATC AGGGGGCATG 420 

GTGAAGAGCA ACTTCGTGCC CACCAATGTG GGTTCCAAAG CCTTTGGCAG GCGCCGCAGG 480 

GACCTTCAAG CCTGAGCAGA TGAATGACTC CAGGAAGAAG GTGTGTCCTA AATCCAATGA 540 

CATATCCTTA TAAGAGATTC ACTCAGAAGA CACATGTGGA GAAGGTGACA TGACAGAGGC 600 

AAGGAGGCAC AAGCCAAGGA AGTCTGTGTC TACCAGAAGC CAGAATCACA GAACAGTCTC 660 

TGGAAGAAGA GCAGCCCTGC TGACACCTAG AGTTTGGACT TCCAGCTTCC AGAACTGTGA 720 

GAGAATAATT TCTGTTGTTT TAAGCCACAA AGTTTGTGGT AATTTGTTAT GACAGCCCTA 780 

GGAAACTAAT ACAATACATT TTCATTTATT TTGGGTAAAT GCCTTGGAGT GGGATTGCTG 840 

GGTTATTTGG AAAGTGTGTA TTTAACTCTG TAAGAAACTG CCAAACTATT TTCTGAAGTG 900 

ACTGTACCAC TTCGCCTTCT TGCCAGCCAC ATATGAGAGC TCTAGTATTT CCACAAATAG 960 

GTATGTAGCA GTATCTCATT GCTGTTTTAA TTTGTATTTC CCCAATGACT AATGACGTTG 1020

AGCATCTATT TTACCATATG TTTATCACCT TTATTGAAGG GTCTGTTTAA ATCTTCTGCT 1080 

AAATTTTTCT TGGCTTGCTT GCTTTATTAG TGTTGAGTTT TTAGAGCTCT TTATATGTTG 1140 

TGGATGCAAG ATTGTTTTCA GATATATAGT TTGGAAACTT CCTTCCCCTG AATCTGCGGA 1200 

TTGCTTTTTC ATTTTCTTAG CAGTGTCTCT CACAGAGAAA AAGTTGTAAT TTGAATAAGA 1260 

TCCAATTCAT CTTTTTTTTT CTTTTATGTA TTGTGCTTTT AGTTCATGTC TAAGAACTCT 1320

TTGCCTAACT AAGGTCCCAA GGTCACAATA ACCTTATTCT ATACTTTCTT GTAAAAGTTT 1380 

TATAGTTTTA TATTTTATAT GTAGATTAGT GATCTATTTT GAGTTAATTT TTGTATAAGG 1440

TGAGAGGTGT AGGTTGAAAT TCATACCTGT GAATATAGAT ACCCAATTGT TTCAGTGCCA 1500 

TTTGTTAAAA AGACTGTTAT TTCACCATTT AATTGCCCCT GCACCTTTGT CAAAAAGCAA 1560

CTGATCATAT TTGTGTGGGT ATATTTCTGG GTTCTCAATT CTGTCTCATT GATTGATTTG 1620 

ACCATTCTTT TGCCAATGTC ATACTGCCTT GATTAGTGTA GTGTTAAAGT GAATCTCAAA 1680 

ACCAGATAAT GTGGGTCTAC CAACATTGTT CATTCTTGTT CAAAAAGATT TTAGCTACAT 1740 

CTAAAATATT TTCTACATCT TTTATACATT TTAGAATCAG TGTGTTACTA TCTACAAAAT 1800 

TTCTGATGAG ATTTTTAATG GGATTGTGTT AAATCAGTGG GTTAATTTTG GGAGAATTAG 1860

CATATTAATA ATATTAAGTC GTTCAATTCA TGAACACAAT ACATGTTTTC ACTTATTTAG 1920 

GTTTTCTCTG TTTTTTTTTT TTTAACAGTG TTCTCAGTTT TCAACAGAAA TATTCTACAC 1980 

ATATCTTGTT AGATTTTTAA CTATTTTATT TTTTGGTGCT AATGTAAATG GTACTTAAAC 2040 

ATTTTTGTTT TTAATTGTTC ATTGCTAGTA GATAGAAATA CAATATTTAA AATATTAGGA 2100 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 202 Protein sequence:
Protein Accession #: NP_000719.1

1 11 21 31 41 51 

| | | | | |

MGFRKFSPFL ALSILVLYQA GSLQAAPFRS ALESSPDPAT LSKEDARLLL AALVQDYVQM 60 
KASELKQEQE TQGSSSAAQK RACNTATCVT HRLAGLLSRS GGMVKSNFVP TNVGSKAFGR 120 
RRRDLQA 

Seq ID NO: 203 DNA sequence
Nucleic Acid Accession #: NM_001741
Coding sequence: 71..496

1 11 21 31 41 51

| | | | | |

CTCTGGCTGG ACGCCGCCGC CGCCGCTGCC ACCGCCTCTG ATCCAAGCCA CCTCCCGCCA 60 

GAGAGGTGTC ATGGGCTTCC AAAAGTTCTC CCCCTTCCTG GCTCTCAGCA TCTTGGTCCT 120

GTTGCAGGCA GGCAGCCTCC ATGCAGCACC ATTCAGGTCT GCCCTGGAGA GCAGCCCAGC 180

AGACCCGGCC ACGCTCAGTG AGGACGAAGC GCGCCTCCTG CTGGCTGCAC TGGTGCAGGA 240

CTATGTGCAG ATGAAGGCCA GTGAGCTGGA GCAGGAGCAA GAGAGAGAGG GCTCCAGCCT 300 
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GGACAGCCCC AGATCTAAGC GGTGCGGTAA TCTGAGTACT TGCATGCTGG GCACATACAC 
GCAGGACTTC AACAAGTTTC ACACGTTCCC CCAAACTGCA ATTGGGGTTG GAGCACCTGG 
AAAGAAAAGG GATATGTCCA GCGACTTGGA GAGAGACCAT CGCCCTCATG TTAGCATGCC 
CCAGAATGCC AACTAAACTC CTCCCTTTCC TTCCTAATTT CCCTTCTTGC ATCCTTCCTA 
TAACTTGATG CATGTGGTTT GGTTCCTCTC TGGTGGCTCT TTGGGCTGGT ATTGGTGGCT 
TTCCTTGTGG CAGAGGATGT CTCAAACTTC AGATGGGAGG AAAGAGAGCA GGACTCACAG
GTTGGAAGAG AATCACCTGG GAAAATACCA GAAAATGAGG GCCGCTTTGA GTCCCCCAGA 
GATGTCATCA GAGCTCCTCT GTCCTGCTTC TGAATGTGCT GATCATTTGA GGAATAAAAT 
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25 



MGFQKFSPFL ALSILVLLQA GSLHAAPFRS ALESSPADPA TLSEDEARLL LAALVQDYVQ
MKASELEQEQ EREGSSLDSP RSKRCGNLST CMLGTYTQDF NKFHTFPQTA IGVGAPGKKR 
DMSSDLERDH RPHVSMPQNA N 

Seq ID NO: 205 DNA sequence 
Nucleic Acid Accession #: NM_005361
Coding sequence: 1-945



ATGCCTCTTG AGCAGAGGAG TCAGCACTGC AAGCCTGAAG 
CCTGCTACTG 
GGGGAGGTGC 
TTCTCGACTA 



TCCTCTTCTA 
CCTCCCCACA 
AGACAATCCG 
CTGGAGTCCG 
CTCCTCAAGT 
AGAAATTGCC 
GTCTTTGGCA 
TGCCTGGGGC 
CTCCTGATAA 
ATCTGGGAGG 



CTCTAGTGGA 
GTCCTCAGGG 
ATGAGGGCTC 
AGTTCCAAGC 
ATCGAGCCAG 
AGGACTTCTT 



TGCGCAGGCT 
AGTTACCCTG 
AGCCTCCAGC 
CAGCAACCAA 



GGAGCCGGTC 



TCTCCTACGA 
TCGTCCTGGC 
AGCTGAGTAT 



GTGCCCGGCA 
ACCAGCTATG 
TACCCACCCC 



GTGATCCTGC 
TGAAAGTCCT 
TGCATGAACG 



GCAAGATCTG 
ATGCTACGAG 
GCACCATACA 
GGCTTTGAGA 



AGGAAGATGG 
ACAAAGGCAG 
TTCAGCAAAG 
CCCATCAGCC 



GTGCAGGAAA 



AAGGCCTTGA 
AGGAGCAGCA 
CTGCTGCCGA 
CCATCAACTA 
GGCCAAGAAT 
TTGAGTTGGT 
AAAIGCTGGA 
CCTCCGAGTA 



I 



I 



TCATTTTCTG 
GAGTGTCCTC 
CTTGCAGCTG 
CCTTGTCACC 
CAAGACAGGC 
TGAGGAGAAA 
TGTCTTCGCA 
GTACCGGCAG 
CCTCATTGAA 
TCACATTTCC 



51 



MPLEQRSQHC KPEEGLEARG EALGLVGAQA PATEEQQTAS SSSTLVEVTL G 
PPHSPQGASS FSTTINYTLW RQSDEGSSNQ EEEGPRMFPD LESEFQAAIS RKMVELVHFL 
LLKYRAREPV TKAEMLESVL RNCQDFFPVI FSKASEYLQL VFGIEVVEVV PISHLYILVT
CLGLSYDGLL GDNQVMPKTG LLIIVLAIIA IEGDCAPEEK IWEELSMLEV FEGREDSVFA 
HPRKLLMQDL VQENYLEYRQ VPGSDPACYE FLWGPRALIE TSYVKVLHHT LKIGGEPHIS 
YPPLHERALR EGEE 



Coding sequence: 



CCCAAACTAA 
CCCTTTGGGT 
GCACCCTGAA 
GGGCGAGCTG 
ACCGCTGCTT 
TTCGCTCAAG 
CACTGTCCAA 
CACGGAGAAG 



AGGGAGGGAG 
TTAGGAGGGC 
CTGGTGTCTT 
CC1TACCTCC 
GAGAGAGTGG 



AAAGGAGAAG TTGGTTTAGA GGCCAGCCGG ACGAGCTTTG 



CA3GTGAACT 
AGGGCAGGGT 
CCTGGCCCAC 

CCCTTCACTT 



ACCGAAAGGA 
ACATGGCCCA 



GCAGCTGAGG 
GTCCCAGGGC 
GGACCCCATC 
GAGTGCGGTC 
TGTGGCCCAC 



CCCAAGGCCA 



GAGATGCTAG 
CTGGCAAAGA 
C3GAA3TGCT 
CAGCCCTGTC 
AGAAACTGCC 
CCTCCGCAGC 
TCTCCTCCTC 
AGGAGGCATC 
CCGCACCCCT 



GACCACTACC 
GCCCCTCAAC 



TCCACCATTA 
AGCTTCTCCA 
AACTTTCTGG 
GTGAAGAGTG 



CCGAAGCCCC ACCAACACCA 



CCTGGGCTAT G 



TCACCACCAC 
ATCCTGAGGG 
AGTGCACATA 
TGAACCTGTC 
TCCTGGCCAA 
TCTCCGTCTA 
AGGCCTTCAT 
TGGACCTGCA 
GCGCTAAGAT 



GACTGGGTCA 
GGTCATCACC 
GTACATTGAC 
CAACGTGACA 
CGATGGGGAA 
CCAGACACTC 
CTTCCGGACC 



CAG3AGGACA CCAGCCCCAT 
GCCTCAGAGG A3AGCCAGGA 
ACOSAGCAGG CACCAGCTCT 
TCCAGCGACT ACCCACTGCT 
GTCTACACTG GCTATGGGGT 
CTGCTCTCCA TCCGCGGGGT 
CTGC-TGGAGG GGCAGGTAAT 
TTCCAGGACG ACGGCCTTGG 
AACTTTCCCC GCCGGCCTGA 



G AGGGCCAGAA G 



G CA'CTTTGAGA GGCTGTTGCT 
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GCATGACAAG GACAGGATGA 
CGACTCCCTT CAAACCGAGA 
CCGCATCGAG TTCACGTCCG 
AGCGTTTGAG AAAGGCCACT 
CGACCCGACC TATAACATTG 
GGAGCAGGGC CCGGCCATCA 
AGAGCCCCTG TGCAGAGCCA 
GTCCCCAAAC TGGCCCGAGC 
GGGAGAAGAG AAACGGATCT T 



GCTATGAGCC 
GGACTATAGT
TCGAATGCAT 



CCTACGTGGA 



AAGGAATGAC TCCTGCTCGG 
CACGGAGTIG GTGCGGGGAG 
GGGGAGTGAC ACCCTCACCT 
TGAGAAAATT ATGTACTGCA 
GGATCCTGTG CTGCTGGTGG 
TGAAGGGAGT 
TCGCCTGCCC 

T ATCTTCATCC 



CCAGAATCAC 



TGAGGGCCTG 
GGCGGCCTCC 
CTACATCCAG 
GGAGTTCACC 
CAATGTGCGG 
GGAGCTCTCT 
AGGTGAAGAT 
CCAGTTCCTG 
GCCCCACATC 
GCCAGACTTA 
GGGATTTATC 
GATCCAGAAT 
CTACCAGTGT 
CCTCAGCTGG 
AGAGGTGGAT 



AACAAGTCAG CTCTTCTCTA 
CTGAGCGAAG 3CAACACCAT 
ACCTTCAACA TCCGATTTGA 
AATGGGAACT TCACTACATC 
TGCGACCCCG GCCACTCCCT 
GACCCATACT GGAATC-ACAC 
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GGCTGGAAAA 



CTACAGCCAG ATCACCGTGG 
CCAAAAGGTT TAGGGTTTCA 
AACCCCAATT TCCCCGAGAC 
AAAGGCGGCT GTTTTTTGGT 
TTTATAAATT TTAAAAGTG 



CAGAAGCGGC 
CGGTCCTCAT 
ACTATTCCAA 
AAACCGAGTT 
TTTAAAAAGA 
ATTTATCCAA 
TAAACTTTTT 



G CCGTGAAACA 



CATCTCCTTA 

TGACAACCCC 
GGTACCCTTT 
AGGCCCTGGG 
AACAAAGGGT 



AGCAGCGACC 
CACTCGACCC 
TGCAACCCCG 
GGGACTCCCA 
TCGCTGGAAG 
CTGCTGGGAG 
CCICTGATGT 
ATTTACGAGA 
AAAAAGGGGC 
GGCCTTGATT 



AGATCCACGT 

ACAGTGACAT 

ACCTTGGGAA 

TCCATTCGGA 

TAGAGGTATC 

CCACTTCTCA 

ATGACATCGT 

CCCCATTTTG

GCTTAATTTC 



ACTCCCACCC 



TACGGGTTTT 



TTGTGAACTC 
TAAACCCCCA 
TTCCCCGGAT 



2940 
3000 
3060 



PEGYIDSSDY 
LANQTLLVEG 
DLHSGGVAHF 



SPMALMDKGE 
PLLPLNNFLE 
QVIRSPTNTI 
HCHLGYELQG 
FCIWTIEAPE 
KTIRIEFTSD 
HSLEQGPAII 
IHVGSEKRIF 



21 31 
I I 
' NELTGSASEE SQETTTSTII 
CTYNVTVYTG YGVELQVKSV 
SVYFETFQDD GLGTFQLHYQ 
AKMLTCINAS KPHWSSQEPI 
GQKLHLHFER LLLHDKDRMT 
QARAASTFNI RFEAFEKGHC 
ECINVRDPYW NDTEPLCRAM 
LDIQFLNLSN SDILTIYDGD 
KGQGFIMNYI EVS3NDSCSD 



VPFEGLLSEG 
TIVEFTCDPG 
YVEGEDCIWK 
SSTPDLTIQF 
RI TYQCDPGY 
TTIQYTCNPG 
VLIISLLLGG 



Seq ID NO: 209 DNA sequence 
Nucleic Acid Accession #: NM_001327.1
Coding sequence: 89-631



TTTVITTEQA 
NLSDGSLLSI 
AFMLSCNFPR 
CSAPCGGAVH 
VHSGQTNKSA 
YSPYIQNGNF 



I 



1YCT DPGEVDHSTR 



RGVDGPTLTV 
RPDSGDVTVM 
NATIGRVLSP 
LLYDSLQTES 
TTSDPTYHIG 
WLSPNWPEP 
LGNSGPQKLY 
TSHTELVRGA 
LISDPVLLVG 
GNMALAIFIP 
GGTQKV 



GGCTCAGCCT 
GCCTCCTCCC 
GTTTGTCGCT 



CCGCTTCCCG 
ATCCGACTGA 
CAGCTTTCCC 
CCCTCAGGGC 
CTAGGGAATG 
GGAGGAGGAC 



CCGAGAATAC 
CCGGAGCCAT GCAGGCCGAA 
CAGGAGGCCC TGGCATTCCT 



TGCCAGGGGT GCTTCTGAAG 
CTGCTGCAGA CCACCGCCAA 
TGTTGATGTG GATCACGCAG 
AGAGGCGCTA AGCCCAGCCT 
GTCCCAGCAC GAGTGGCCAG 
GGCTTACATG TTTGTTTCTG 



41 51 

I 

CGTGGGCCCT GACCTTCTCT 
GGCCGGGGCA CAGGGGGTTC 
GATGGCCCAG GGGGCAATGC 
CCCCGGGGCG CAGGGGCAGC 
CATGGCGGCG CGGCTTCAGG 
AGCCGCCTGC TTGAGTTCTA 
GCCCGCAGGA G 



CTGCAGCTCT CCATCAGCTC 
TGCTTTCTGC CCGTGTTTTT
GGCGCCCCTT CCTAGGTCAT 
TTCATTGTGG GGGCCTGATT 
TAGAAAATAA AACTGAGCTA 



MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA 
PRGPHGGAAS GLNGCCRCGA RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPLPVPG 
VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQLSLLM NITQCFLPVF LAQPPSGQRR 



Seq ID NO: 211 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 52-459 



I 



I 



41 



I 



CCTCGTGGGC CCTGACCTTC TCTCTGAGAG CCGGGCT AG GCTCCGGJ CATGCAGGCC 
GAAGGCCAGG GCACAGGGGG TTCGACGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 
CCTGATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 
GGTCCCCGGG GCGCAGGGGC AGCAAGGGCC TCGGGGCCGA GAGGAGGCGC CCCGCGGGGT 
CCGCATGGCG GTGCCGCTTC TGCGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAGGCCG 
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GACAGCCGCC TGCTTCAGTT CCGACTGACT GCTGCAGACC ACCGCCAACT GCAGCTCTCC 3 60 

ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420

GTGTTTTTGG CTCAGGCTCC CTCAGGGCAG AGGCGCTAAG CCCAGCCTGG CGCCCCTTCC 480 

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCACGA GTGGCCAGTT CATTGTGGGG 540 
GCCTGATTGT TTGTCGCTGG AGGAGGACGG CTTACATGTT TGTTTCTGTA GAAAATAAAG 600 
CTGAGCTA 

Seq ID NO: 212 Protein sequence



MQAEGQGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 
PRGPHGGAAS AQDGRCPCGA RRPDSRLLQF RLTAADHRQL QLSISSCLQQ LSLLMWITQC 
FLPVFLAQAP SGQRR




A CCAGCTTGGT GGGGAAAGGG TTTGATGAAT 60

AGCACAAAGA CACTGGCTGT TCCCTGGAGG CTGTCCCTTT AAAGGAGAAT CTTAGTTTAT 120

TCTGGGGGGA GGGGATGCAC ACATTAGAGT AGGAAAGAGG GCTTGGAATA AAATGAAAAC 180 

ACTCCCCCTT CATAGTCATT GTACTGAAAT GCAAAGACTG CTTCCTAAGC TGGAGATGCT 240

AACCTTGGGT AGCTCCTTCT GTTCTCTTCA AGGGGAATTT TGTCAGGCTA TGGATTCATT 300 

TACAACTGTT AGTCATGTGG GCATGTGTGA GGAAACAGAT GCCAGTTTTA ATGTATTTAG 360

CCCGAAGTTC CAATTTGATA GGAGCCACTG TCAGTCTCTG AGGTTCCACC AAAATATGGA 420

ACTTGATTTT GGACACTTTG ACGAAAGAGA TAAGACATCC AGGAACATGC GAGGCTCCCG 480

GATGAATGGG TTGCCTAGCC CCACTCACAG CGCCCACTGT AGCTTCTACC GAACCAGAAC 540

CTTGCAGGCA CTGAGTAATG AGAAGAAAGC CAAGAAGGTA CGTTTCTACC GCAATGGGGA 600 

CCGCTACTTC AAGGGGATTG TGTACGCTGT GTCCTCTGAC CGTTTTCGCA GCTTTGACGC 660

CTTGCTGGCT GACCTGACGC GATCTCTGTC TGACAACATC AACCTGCCTC AGGGAGTGCG 720

TTACATTTAC ACCATTGATG GATCCAGGAA GATCGGAAGC ATGGATGAAC TGGAGGAAGG 780 

GGAAAGCTAT GTCTGTTCCT CAGACAACTT CTTTAAAAAG GTGGAGTACA CCAAGAATGT 840

CAATCCCAAC TGGTCTGTCA ACGTAAAAAC ATCTGCCAAT ATGAAAGCCC CCCAGTCCTT 900 

GGCTAGCAGC AACAGTGCAC AGGCCAGGGA GAACAAGGAC TTTGTGCGCC CCAAGCTGGT 960 

TACCATCATC CGCAGTGGGG TGAAGCCTCG GAAGGCTGTG CGTGTGCTTC TGAACAAGAA 1020

GACAGCCCAC TCTTTTGAGC AAGTCCTCAC TGATATCACA GAAGCCATCA AACTGGAGAC 1080 

CGGGGTTGTC AAAAAACTCT ACACTCTGGA TGGAAAACAG GTAACTTGTC TCCATGATTT 1140

CTTTGGTGAT GATGATGTGT TTATTGCCTG TGGTCCTGAA AAATTTCGCT ATGCTCAGGA 1200

TGATTTTTCT CTGCATCAAA ATGAATCCCC ACTCATCAAC GGAAACCCAT CAGCCACAGC 1260
TGGCCCAAAG GCATCCCCAA CACCTCAGAA GACTTCAGCC A 
CCGAAGCAAG TCTCCAGCTG ACTCAGCAAA C 
CAAGTCTAAC CACTCTCCCA TCTCTACCCC C 
GGACCTGTAC CTGCCTCTGT CCTTGGATGA CTCGGACTCG C 
GAGGGGAGAG TGCTCAGAGT CCAGAGTACA A 

TCTGCTCAAG TGTCCAACAG GGCTATTGGT GCTTTCAAGT TTTTAT 
TATTTTGAAA AACACATTGT AATATGTTGG GTTTATTTTC CTGTGATTTC T 
CACTGATCCA CAGTTACCAA TTATGAGAGA TAGATTGATA ACCATCCTTT GGGGCAGCAT
TCCAGGGATG CAAAATGTGC TAGTCCATGA CCTTTCAATG GAAAGCTTAG GGGCCTGGGG 
TAAATTTGCC CCGTTTAAAT TTGCCCAAAC AGTTTTCCTT TTGTAGAGGG GTGTTTAAAT 
T GTGTGGGGAA AAAAAAAACT CATTGGCAGA TCCAAGAATG 
V3AATGG TGGAGGACCC TGGAAGGACA 
T CACTCTTCAC TCCTGATTGA GGCCCGGGTT TGTTGTCCAG 
CACCAATTCT GGCTGTCAAT GGGGAGAAAT AAACCAACAA CTTATAATTG TGACACCAGA 
TGCTTAGGAT CCTGGTGCTG GGTTAGCTAA GAGAATAGAC AGAATTGGAA AATACTGCAG 
ACATTTCCGA AGAGTTTATA AAGCACAGTG AATTCCTGGT CAATCTCTCC ACTGAGGCAA 
T TGGAGTAAGG GACTTCATAT A 
T TACATGAACT GTATGGTATC C 

TGTTCTATTG AATGCCTTGT TAACAGCCAA CACTGAAAAC ACTGTGAGAA TTTGTTTTCA 2400

GGTCTGACAC CTTTCAGTCT CTTTTTATAG CAAGAAATCA ATATCCTTTT TATAAAAATT 2460

Z AAACTCTTCA GGCTCCTTTT TTATAAACTG GTGATTTTTC 252 0 

A AAAACACATG AAGAAAATTT ACCAGAAAAA AAAAAAAAAG CCGAAGAATA 2580 

ATGTTATTTA GAAATTATGC TGTCACTGCC AAACAGTAAC CTCCAGGAGA AAACAAGATG 2640

AATAGCAGAG GCCAATTCAA TAGAATCAGT TTTTTGATAG CTTTTTAACA GTTATGCTTG 2700 

CATTAATAAT TTCAATGTGG ACCAGACATT CTAATTATAT TTTAAATGAA ATGTTACAGC 2760 

ATATTTTAAG CAACTCTTTT TATCTATAAT CCTAATATTT CATACTGAAG ACACAGAAAT 2820

CTTTCACTTG TCTTTAACAT TAGAAAGGAT TTCTCTTTAC TAAGGACTGA TCATTTGAAA 2880 

TAGTTTTCAG TCTTTTGAGA TACAGGTTTA TAACACTGCT TTTTTTTTCC TGTAAACATA 2940 

GCCCATAATG GCAAAAACAA CTAATTTTAA TTGAAGGTCT TGCTTGCCAN TCCTGTGTTG 3000 

GCTTTNACCA AATATAAAAA TTCCCTTATT CCTTGGTAAT GGTGCAAATN TTTGGAAAGG 3060 

CACAGCATCC AAACCAAGCT GCTGTTTGGC TACTGAATGG CTTGCAGTTG TTCCTCCACT 3120

CTAAATGGAA TGAGCTTGCT GTGTGTGTGT GTGGTGGTGG TGGGAGGGGG TGGTGCATGT 3180 

GTGTGTGTGT GTGTGCATCT GCAGCTGCTT CAAAATTAAG AAATACTACA AGACACCCCT 3240 

GTAATGGATT GGTGGCAACT GGGTGGCACT GCTGATGTGC ACTGTGTAGG GGGGAACCCA 3300

GTGGTGGTGG GGTATCTCAA ATGCCCCTAG ACAAGCTTCA GATGTCTGTA GCTACCAAAA 3360

ACATTTTCGG TTCAAGAAAA GTGAGATGAT GGTAGTACTG GTTTCTGGTG AAATTGAAAA 3420 

ACCCCAAATG ATGAGGATCT CTTTTTGCCC CCTCTCCTTT TTTTGTAAAC CCATTCAAAA 3480

CCATTAATAA GCCCATTTTA CTAAHCCCCT ATTTCTTTCT AGAAGCTCAG GGTTTNCTTA 3540 

GTGCCTCCCA NAACATTTTG TAGTTAATTG GGAAAAAGTG ATACTTGGAT TAGGGGGTGT 3600 



CACCCATAGT NTCACTTTAG GTCTCATTTA GTCCATCACC TTTATTTTAA GTTGAGGAAG 3720

TGGAGGCTGG TAAAGAGCAG GACCAGAGGA AGAATCCAGA TTTCCTTATG CTTGGGCCTC 3780 

ACACTAGCTC TNTGAGTATT TCCTTGATTG CGGTATATGT ACTACTAGAA AATACCAAAT 3840 

GGATATATTT TCTTTAGGAT AACCTTTGAA CCAACAATNT TCAATAACAA TAGTACATCT 3900 
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TCCATCTTAC TTTTAATCGA GTATAAGGAA ATGTTTCTTT ATGGCCATTT 7GGAGGGAGC 3960 

AGGGGATGAG GCTTGGCATA GTCCAAAATT TAAGNCTCCA ATAATTAATT GCATTTTAAA 4020

T GTGTCTGTAA CTGAGCTCCT 4080 

T CATTCACTTC CAATTTTACC CAATCCAATT TTAGCACTCA AGTTCCATTG 4140 

TGTTAATTTT TGCACGGTCT ACACACATCA AGTCAGCAAG CATTTGCCAC CACTCCCTAT 4200

ACTTCTCCCT CTTTTTTACA CACACACACA CACACACACA CACAATCCAT CTCTTGCTTG 4260 

TTCCTACCTC CCTGATTTTT CTTCCCTACA GAAATAGAAA TAGGGACAAA GAAGGGGAAA 4320

ATGTATATAT TGGGGCTGGG CTGAACAACT AACTTCATAA GTAGTATTAA CTAGGGGTAA 4380

ATTGAGAGAA AAGCTCCTTT TCTCTTCACT GTTTTGGAAA GGATAGCCAT TAGCATGACT 4440

GCTTTGTGTC CTTATGGACT TTAGTATTAG CCTAGATTGA ATTATAGCGT TTTTCTAGCT 4500 

GAAGGAACCT TAAGATCACA TCATCTACTC CTCTACTCCA AATTTCTCAT TCTTCAGGCC 4560

AGGAAACCGA GACACAGAGG TAAAGTAATT TCCCCAAGGT CACACAGCTG GCTGGGGCAG 4620

G TCATGTTGCA T 

C AACCTAATAC AGCCCCTTTT 4800 

3TCCTTGAAG AGAGTGAGGC 4860

ATTGAGGGCC AATAGGAGCA ATGGGGTCCC TGGCCTTGTC CATCTGATTC AGGAGATCAC 4920

TGCTCCATCG TGAGGAGCCC TCTGAATAGC CCCCCACTGA ATGCTTGCCT TGCCCAAATG 4980 

GAATGGAGGA AGATTGATTT TCTCCATCAG TTCACCTTGT GTCATCTCAT AATGGTTGGT 5040

CTTTCCAGGC TGAGGGAAAT GTTTCTTGTT TCCAMAGTAN AAAAAAGAAA GAGTGGAACA 5100 

ATANCTTTGT TCATCCTAAC TTTCTGAGAT GGCTTTTCAA CATTTAAAAA AAACTAGTGT 5160 

GGTACCATTC ACTGGCANGA TTTNTTTTAG AATATGGGAG TAAGATGAGG TAGAGAAAAT 5220

AACCTGGTCT CACTGTGGTT GCCCTCATCC ACAATGTCCC CAAAGCCATC CTGCTMTGAT 5280 

GAGGACAATT TCCAGGTATA AGCAAGGGGC TTTGTGACAA AAATGTACCC TGGCTGATGT 5340

TAAACATTGG CTCCTGTGTT TGCACCAAAA TAGCAAGCTG TGTGCTCTAT ACACTCTTCC 5400

CATCGTCTTG TGTACACTGC TCCTGTGGCC TTCCACAGCA GAAACCAGGG CAAAAGGGTC 5460

CAAACACATG GTTTTCCTTG CTGCAAGGCT NTTCCTGGGA ACTAAGGGGG TATTTATTAG 5520

TTCAGTTNTA AGAGACCTCC TTCTGGGCTT ACCCCACTCC TCAGGTACTT CTCTCTCCTT 5580 

CCTCCTTCTC CTCCACAGTC ACAAGTAACC AAGGAACCTG AAAGTGGATG TGTAGCTATT 5640

TGAAGAAGGC AAGGAACCCT GAGATTCTTC TTTGAATCCT TTAGTCCAAG TCTTAGACCA 5700 

GTGATTGGTG CTTACCTTGA ACAAAATTTT GTCTGTGTTC CTAATCCCTT CAATACTNTG 5760 

GGTACAATGC TCCCAATCAC CCTGCACATT TGATTCTAAA TGGCTTTTAT TTTTTAAAAA 5820

TCCATATCCC TAGGACAAGA NAACAGGATG CCTATATCCC CAAAATGAGC TCCAGGACAC 5880 

TGATGGGAAT GATCCCAANG ATCACCCCAC CTCAGAAAAC GTCTGTGCCA ANAGACTTCC 5940

CCAGATAGAA NCACTGGGAC AGTGGTTTGA ACGACTTCTT TTATGGTTGT CCAGTTTGCT 6000 

ATGGAAATAA AAGGCATTGA TTTTTTAAAA AAGATGATTG GAACCTGTCT TTGGCCACAT 6060

AGGGCCACTT GGATCCATTT CCAGGCCTTA CTCATATATT GCCTTCACTG AAGGGCTTTG 6120

GCTTTAAGTC CCAGACTGGT CTCCCAAGTG AACCATAAGT GTTTTGGAGC TCATCTGGGG 6180 



TCCTAAAGCC TGGTCCCCAA AAATTGTTTT TGTCTCCAAA AGTCTAGTAT GGTCTTTATA 6300
CACCCANACT CTTAGTGTTG CGTCCTGCCT TGTTTCCTTG TTAAGGATCT A 
CCCGCTTTGG CTTAGCTAGC GTGACATTGG CTATCATTTG ACAAGAC 
TTTTTTTTTG ACTGAGTCTC CCTCTGTCAC CTAGGCTGGA GTGCAG- 

CTCGCTGCAA CCTTCACCCT TCACCTCCCA GGTCGAAGCG ATTCTCCTGC CTCAGTCTCC 
CGAGTAGCTG GGATTACAGG CGTGCGCCAC CAAATCTGGC TATTTTTTTA TTATTATTAT 
TTTTAGTAGA GATGGGGTTT CACCATGTTG GCCAGACTGG TCTTGAACTC TTGGCCTCAA 
ATTATCTGCC CACCTCGGCC TCCCAAAGTC CTGGCATTAC ACCCATCACC ACCATGCCCA 
GCTGACAAGA CTAATTTTTT ATCCCTTGGT TTATTGGCTT CAACATCTTC TGGAATCAGA 



TCTTTGACTT TTCTTTCTCT GTCTAGTTTC C 

AGTCTCTCTT TCCACAGTAC AAACATCCAT CCTTTCTCCT GTGCAATTCT GTCTCTCCCT 7140

CTTATTATCT TTATTTGTAC TTTTTCCTTC CTCCCTGTCT AGGCATTGGG CATGTGCCTC 7200

TTCTTAGCCT GTGATTTTGC CTTGGGACTG ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

CTGGTCCTAC CCCAGTCCAA TCAGAAGTAT GTTGGTGGGG AATCAACCTG ATCCTGGCCC 7320

TTTCTTCTTC TCCATTTTCA TTCGTAATCC CCCTCAGCAG ATCTTTACAA GCAGTTTCCT 7380 

TATAGCTCAT GTATCTTTAG GTCTTTGCCT TCCAAGCACT GTACAGAATA CTTTGTGGTT 744 0 

CCTTTTTAGT CTGACATTTT GTGGAGCAGT GAAGCGTGCT CAGAGACATA ATCAGCTGAA 7500 

GAGAAAAAAT CCACCCATGG ATTTATATCA GCTAAATACT AATAATTGAT TTTGTTTGAT 7560 

GTGCCCATAA TTTTTAAAGC TGCAATATAA TATAATGAGG GACCACAGGT AATTTCTCCT 7620

GTCATTTGTT TTGGCTGGAT GGGGGTGGGG GAGTAATTGC TTAAAGTTTT ACCATTACAC 7680 

ATTAAACTCT CTATAATAAT CTTGTTTGGG GCTTGCTAAC TGTTGAGCTG TTTTAACTAA 7740

ACTGGTAGGC AATCGGAGTT GATTTAAATG AAAAGATAAT TTAACAAATC TATACTATAA 7800

AAAGAGACAT TTGCTTAATT GACATGTATT TTTTCCTTCT GAGTCACCTA AACATTTACT 7860 

CTTGACACCA ACTGTTCATG ATACTGAATA GACAGTCCAT ATAAGAGAAA TTAGTGGACC 7920

TAAAGAAGCC AGATTGTAGG TGTTAATTTA TTAAACAGAA TTGCAAAGCC CTTGGAAATG 7980 

TCACTGCTTG GCAATACCAT ATGGCATGCC AAAATTTACA ATGACTTTTC TTTATAAGTT 8040

ATCCAAAAGG GATTTGAACA AGTAAGAGGT TATGCCAAAA TGTCTCCAAT GTATGGTCCT 8100 

GTAATATATT GCAGCTTGAA GCCAATGATC CCTTATGACT TGTATACAAC TAATGCATGT 8160 

TTTATTGAAT TTTGCATTTC CCACGTGTGG TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220

NTGTGCCATT AAACTTGTAC AGAAAATGTT TTTATGGCCA TTTTCAAAGG GAGAAAGTTT 8280 

AAAATGGAAA CAGCCCACCC TTTCTGCCCT ATAGCTGTAG TTAGAATTGA GTACCTGTAG 8340

CAAAACAGCT GTAATTGGTG GTTGTAGTGT TAGAGGTGTT AGCTTGCTAG TGACTAGCTT 8400

TGGAGAGTAA ATGCATGGTA TTGTACATCA CATTTCTTAA CTCGTTTTAA CCTCTGAAAA 8460

GAATATATTC TTCTTTGTAG TCCTTCTTCC CACCCCCTTG CCCTCTCCCT CTCCCTGCTC 8520

CCAGTTGTCT TACAGTTGTA AATATCTGAT TTGAGGCCCA ATAACTCTTG CCAAGTA^G 8580

TCAGCAAACA ACAAACAAAC CAAAATGTGG GGAAAAGGCA TTTCTCAACC ATCTCTCAGC 8640

AGTTATTGAT CATTTCTTAA GGAACAGCAT TGTGATCAAA GACTCAACTT TACGTAAAAA 8700 



AATCACATGT AATCCAAAGA CAGTAGGTAG TGATGTCCCT TATCCCTGCA GCTGTTTTAA 8820

GATAGAGACC TCAGAAGACT CTGCTTGACC GATGACCAAT AATTATTTGA AAAAAAAAGA 8880 

TATTTAA GAACTTTAGC CACCTATTTA GAATAGTTAT 8940

3 GCATGAGTTC AAATGCATTA CTATCAGTGT CCTAGGCAAT 9000

h CTCTGAAATT GTGATTCAAA AGCAGTATTT CAAGAGGCAT TCTCCTTTTT 9060 

TGGTTTGCTG ACCCCACTTG GACTGGTAGG TTTGGTGAGG CCCCCATAAA CCAGCTGGAG 9120
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CAGACCCTTT TCATCTCCTG TGCCTGTAAC ACCCCTCTTC CCCCACCCCC TCCGCAATTC 9180 

AATGAGGGCT TTCTTGGGTC AGAGGACTTC AAGGTTGTCT AGAGAAGTTT GCCATGTGTG 9240

TAAGGTGCTG TGAACTGTGA GTGCTGAAGA TTCGCAGCAT TCAATACCAG GCAGCCAAAG 9300 

AGCTGCTCTT GCAATTATTT TGGCTCTCAA GCTCTGTTCT TCATCGCATT CTCATTTCTG 9360
TGTACATTTG CAAGATGTGT GTAATGTCAT TTTCCAAAAA TAAAATTTGA TTTCAAT 



| | | | | |

MELDFGHFDE RDKTSRNMRG SRMNGLPSPT HSAHCSFYRT RTLQALSNEK K" " 
GDRYFKGIVY AVSSDRFRSF DALLADLTRS LSDNIMLPQG VRYIYT] 
EGESYVCSSD NFFKKVEYTK NVNPNWSVNV KTSANMKAPQ SLASSNSAQA R 
LVTIIRSGVK PRKAVRVLLN KKTAHSFEQV LTDITEAIKL ETGWKKLYT LDGKQVTCLH 
DFFGDDDVFI ACGPEKFRYA QDDFSLDENE CRVMKGNPSA TAGPKASPTP QKTSAKSPGP 
MRRSKSPADS ANGTSSSQLS TPKSKQSPIS TPTSPGSLRK HKDLYLPLSL DDSDSLGDSM 



Seq ID NO: 215 DNA sequence
Coding sequence: 312..644

1 11 21 31 41 

GGCACGAGGC AGAGCTCTGC AAGGAGAGGT TGTGTCTTCG TTCTTTCCGC C 
CTTTCCAACA TCTTCGTTCT TTCTCACTGA CCGAGACTCA GCCGGTAGGT CTGCAGAGTG 
GTCTTCCTGG TAATTTAGTT GTGAGTGAAT GTGTGGAGGA GCCAGCGGGC TTAGGACAGG 
TCCTGTGGCA CAGTCCGTGG CTTTGAGGGA AAAGGGCCTC GCGGTGC 
CCCAGGTCGT GATGCAGGCG CCATGGGCCG GTAATCGTGG CTGGGCTGGA A 
AAGTGAGAGA TATGAGTGAG CATGTAACAA GATCCCAATC CTCAGAAAGA GGAAATGACC 
AAGAGTCTTC CCAGCCAGTT GGACCTGTGA TTGTCCAGCA GCCCACTGAG GAAAAACGTC 
AAGAAGAGGA ACCACCAACT GATAATCAGG GTATTGCACC TAGTGGGGAG AT CAAAAATG 
AAGGAGCACC TGCTGTTCAA GGGACTGATG TGGAAGCTTT TCAACAGGAA CTGGCTCTGC 
TTAAGATAGA GGATGCACCT GGAGATGGTC CTGATGTCAG GGAGGGGACT CTGCCCACTT 
TTGATCCCAC TAAAGTGCTG GAAGCAGGTG AAGGGCAACT ATAGGTTTAA ACCAAGACAA 
ATGAAGACTG AAACCAAGAA TATTGTTCTT ATGCTGGAAA TTTGACTGCT AACATTCTCT 
T TTACAGTTTT CTGCAAAAAA AAAAAAAAAA AAA 



Seq ID NO: 217 DNA sequence 
Nucleic Acid Accession #: NM_001476.1
Coding sequence: 82..435

1 11 21 31 41 

GCCAGGGAGC TGTGAGGCAG TGCTGTGTGG TTCCTGCCGT CCGGACTCTT T 
TGAGATTCAT CTGTGTGAAA TATGAGTTGG CGAGGAAGAT CGACCTATTA TTGGCCTAGA 
CCAAGGCGCT ATGTACAGCC TCCTGAAGTG ATTGGGCCTA TGCGGCCCGA GCAGTTCAGT 
GATGAAGTGG AACCAGCAAC ACCTGAAGAA GGGGAACCAG CAACTCAACG TCAGGATCCT 
GCAGCTGCTC AGGAGGGAGA GGATGAGGGA GCATCTGCAG GTCAAGGGCC GAAGCCTGAA 
GCTGATAGCC AGGAACAGGG TCACCCACAG ACTGGGTGTG AGTGTGAAGA TGGTCCTGAT 
GGGCAGGAGG TGGACCCGCC AAATCCAGAG GAGGTGAAAA CGCCTGAAGA AGGTGAAAAG 
CAATCACAGT GTTAAAAGAA GACACGTTGA AATGATGCAG GCTGCTCCTA TGTTGGAAAT 
: AATAAAGCTT TACAGCCTTC TGCAAAA 



Seq ID NO: 219 DNA sequence 
Nucleic Acid Accession #: NM_001476 
Coding sequence: 90-3671 

1 11 21 31 41 51 

I I I I I I 

ACAGCGGAGC GCAGAGTGAG AACCACCAAC CGAGGCGCCG GGCAGCC 
AGACAGAGAC TGAGCGGCCC GGCACCGCCA TGCCTGCGCT CTGGCTGGGC TGCTGCCTCT 
GCTTCTCGCT CCTCCTGCCC GCAGCCCGGG CCACCTCCAG GAGGGAAGTC TGTGATTGCA 
ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 
TCCGCTGCCT CAACTGCAAT GACAACACTG ATGGCATTCA CTGCGAGAAG TGCAAGAATG 
GCTTTTACCG GCACAGAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 
CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCAGCTG TAAACCAGGT GTGACAGGAG 420

CCAGATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATGCGGGG TGCACCCAAG 480 

ACCAGAGACT GCTAGACTCC AAGTGTGACT GTGACCCAGC TGGCA7CGCA GGGCCCIGTG 540 

ACGCGGGCCG CTGTGTCTGC AAGCCAGCTG TTACTGGAGA ACGCTGTGAT AGGTGTCGAT 600

CAGGTTACTA TAATCTGGAT GGGGGGAACC CTGAGGGCTG TACCCAGTGT TTCTGCTATG 660

GGCATTCAGC CAGCTGCCGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720 

TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 780 

AATGGTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA ACGAC7AGAC CCTGTCTATT 840 

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATGTG ATTCTGGAAG 960 

GTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TGAGTTACTT TGAGT AT CGA AGGTTACTGC GGAATCTCAC AGCCC7CCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCCGCCCTG 1200 

TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACAAGGGGC 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GGGCCTTTTG 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500 

CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCCGGG GTCACCGGTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620

TGAGGCCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800 

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA 7GGAGCATTC AGCTGTCCAG 1920

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100

AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGGGCT CTGGGAAGTC 2220

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGGGC 2340

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400 

CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460 

CACTGGTGCG CAAGGCCCTG CATGAAGGAG TCGGAAGCGG AAGCGGTAGC CCGGACGGTG 2520

CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTCCCTGGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGCG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820

AGAATGGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880 

ARAGCAGAGC ACAAGAAGCA CTGAGTATGG CCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT C7CAAC7TCC 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAACTCGAA GCACACCTGC AAAGGAAGGA GCTGGAGT7T GACACGAATA 3300

TGGATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGA7ACCAGA GCCAAGAACG 3360

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420 

AGCCTCTCAG TGTACATGAA GACCCCCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480 

AGACCCAGAT CAACAGC CAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540 

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200 

ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTCG 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGAC-GT TGAGT7ATGA 5040 

T GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 



CAATTGTTAG ATGCC 
MPALWLGCCL CFSLLLPAAR
DGIHCEKCKN GFYRHRERDR 
HMLTDAGCTQ DQRLLDSKCD 
PEGCTQCPCY GHSASCRSSA 
SSAQRLDPVY FVAPAKFLGN 
MPLGKTLPCG LTKTYTFRLN 
P VSGAPAPWVE 



ATSRREVCDC NGKSRQCIFD R3LHRQTGNG FRCLNODNT 



VTGERCDRCR SGYYKLDGGN 
VQRNGSFAK1 QWSQRHQDVP 
HPSAHDVIL3 GAGLRITAFL 
RNLTALRIRA TYGEYSTGYI 
KRDSARLGPF GTCIPCNCQG 



CNNCPPGVTG A 



CDPAGIAGPC DAGRCVCKPA 
EYSVHKITST FHQDVDGWKA 
QQVSYGQSLS FDYRVDRGGR 
EHPSNNWSPQ LSYFEYRRLL 
QCICPVGYKG QFCQDCASGY 
ECADCPIGFY NDPHDPRSCK 



QDILRDAQI S 
RLITQMQLSL 
ETEDYSKQAL 
ADRSYQHSLR 
NWKEEAQQLL 
QVDNRKAEAE 
IEQEIGSLNL 
AQKVDTRAKN 
MMSELEERAR 

Seq ID NO: 



EGASRSLGLQ 



SLVRKALHEG 
LLDSVSRLQG 
QNGKSGREKS 
EAMKRLSYIS 



VGSGSGSPDG 
VSDQSFQVEE 
DQLLSRANLA 
QKVSDASDKT 



EALISKAQGG 
YQSRLDDLKM 
PNGFKSLAQE 
AWOGLVEKL 



NHVDPSASGN 
EPVGCRSDGT 
DGWPDTELE 



AADAQRAKNG 
ERKBL3FDTN 
VLLEQKLSRA 



V KNLENIRDNL PPGCYNTOAL EQQ 



TREATQA3IE 
EFKRTQKNLG 
ILKNLREFDL 
AGEALEISEE 
MDAVQMVITE 
KTQINSQLRP 
GTCAAGAAAA



I 






ACATTATGCC 
GCTGATCTCT 
ATATTGAAGG 
CTGCTACTTG 
GCAACACTGT 
GCGATTAATA 
AAGGAGGACT 
AATTTGCTGG 



ATATGCTGCA 



TACGCCATCG 



ATTCAGCTTT 
TGCATCGGCC 
GAGAGGTCTT 
AATGGCGAAG 



GGTCATGCTA 
GTTTGTCTGA 
TGGGGAAGCA 



ATAATGTGAT 
ATCTGGAATA 
CTGAGAATGA 
ACAGAGCTCA 
GAGCCACAGC 
TGAAGGCAGA 
TAGGGTATTC 
CTTTGGATGC 
GCAAGGAAAA 
TCGAAGTCCG 
GAGTGTCTCC 
TCACCCTCGC 
CTGTGGGAAT 
CACAGTTTTC 
TGACCAAGTG 
GGTTCGCCTT 
TGTACAATGT 
GCACTCAGGA 
GCTTCAACAC 
TCTTCTGGTT 



AATTGTTCGA 



CTTTGCCACG 
GTATGAGGAG 
ACGGTTGGAA 
CATAGAAGAT 



ACTCCTTCAG 
CTTTCAAAAG 
GAAGGCTTGC 
TGGCTGAAAG 
GAGTGTTACG 



CTGCCGATTG 
CACAAGGGCA 
TGACGTGGCC 
GAGGAGTTTC 
TCTGCAGAAG 
CATCGGAGAC 
CAGTGGGAAT 
CTACTTAGAG 
CATCTTGTAC 



TGGGTGTTGA 
GTATCGCAGA 
GCCATTACTC 
CTCATCATCG 
CTGGATTTGG 
TCTGAGATAG 



ACTCAAAATA TATGGAGGAA 
GGACTCTCTG TGTGGCTTAT 
TCTATCAGGA AGCCAGCACC 
AGATCATTGA GAAGAATTTG 
CAGGAGTTCC AGAAACCATC 
CAGGAGACAA ACAAGAAACT 
ATATGGCCCT TATCCTATTG 
TGACCTTGGG 



CACTCTCGTG 



GAGCATGCTC 
AAAGGTTTTO 
TCCCATGAAA 



GAAGGCATGC 
AAGCTTCTGT 
TGCTTCTATA 
TTTTCTGGGC 
GCTTTGCCGC 
AGGTTTCCCC 
TGGGGTCACT 
GCTCTGGAGC 



ATGTCGGGAT 
AGGCCACCAA 
TGG7TCATCC 
AGAACGTGGT 



GCCAAGCACA 



AAGCTGGTTT 
TGCTGACCTG 
CTCCAGATAT 
TATTTCTGGT 



ATTTGTTGGA AATATTG 



GAAAAGCGGT 



GCTGGTGTTT 
GAGAGGACAG 
TCCTACTGCC 
GACATTGCTG 
GCTGCGGGAT 



JCTTGGA 1 k 
TTTGGCATCT 
GCAACTATGG 



CCT7CACTCT 
AGCTCTACAA 
GCATCAACGC 
ATGATACTGT 
ACACATATGT 



CAAAGCGGTC 
GAAGAAGCGG 
GATCCAGACA 
CAACTCGGAT 
ACCCTCCACC 

TGAACGTTGG 
GGGAATCTTT 



ACTCGACCAT 
TCCTGAGCTC 
AAGATGTGGC 
AGGAGCTGGA 



CGCCTGATCA AGAGGCTGGG CCGGAAGACG CCCCCGACGC 



CTTGGTCCAC 
G7TTGACAGT 
TGTTGTTACT 
TCTGGCTGTC 
CTGGCCCACC 
CGCACACTTC 
ATGGAGAGCA 
AACCAAGTCT 
CGAGCGCGAC 
CAGCTCCCTG 



CCAGGAAGAA ATAAGACATG 



CAGCAGGGCG TCCCGCATGG GTATGCTTTT TCTCAAGAAG 

GAAGAAGTCA TCCGTGCTTA TGACACCACC 

AATTTTCCTG ACTGATCTTA GGAAAGAGAT 

TTTGTCAGAG AAGACT GGCG TCCAAGGCCA AAACACCAGG A 

AGTTAAGCAG TTTGTTAGTT ACATATTCCC TCGCAAACCT GGAGTGCAGA CCACAGGGGA : 

AGCTATCTTT GCCCTCCCAA CTCGTCTGCA GTGCTTAGCC TAACTTTTGT T7ATGTCGTT : 

ATGAAGCATT CAACTGTGCT CTGTGAGGTC TCAAATTAAA AACATTATGT TTCACCAATA 



MSVIVRTPSG 



I 



I I I I 

RLRLYCKGAD NVIFERLSKD SKYMEETLCK LEYFATEGLR T. 
YQEASTILKD RAQRLEECYE I I EKNLLLLG ATAIEDRLQA GVPETIATLL 
GDKQETAINI GYSCRLVSQN MALILLKEDS LDATRAAITQ HCTDLGNLLG 
GHTLKYALSF EVRRSFLDLA LSCKAVICCR VSPLQKSEIV DWKKRVKAI 
VGMIQTAHVG VGISGNEGMQ ATNNSDYAIA QFSYLEKLLL VHGAWSYNRV 
NWLYIIELW FAFVNGFSGQ I LFERWCIGL YNVIFTALPP F7LGIFERSC 
LYKITQNGEG FNTKVFWGHC INALVHSLIL FWFPMKALEH DTVFDSGHAT 
TYWVTVCLK AGLETTAWTK FSHLAVWGSM LTWLVFFGIY STIKPTIPIA 
LSSAHFWLGL FLVPTACLIE DVAWRAAKHT CKKTLLEEVQ ELETKSRVLG 
RLNERDRLIK RLGRKTPPTL FRGSSLQQGV PHGYAFSQEE HGAVSQ3EVI 



KAEIKIWVLT 
KENDVALIID 
TLAIGDGAND 
TKCILYCFYK 
TQESMLRFPQ 
DYLFVGNIVY 
PDMRGQATMV 
KAVLRDSNGK 
RAYDTTKKKS 



Seq ID NO: 223 DNA sequence 
Nucleic Acid Accession #: BC017001 
Coding sequence: 1-394 
| | | | | |

AACGCTGGGC AGGGCCGGCG CGGGTCGGGG GGCGCCCGAG GGGCCCGGGC CGAGCGGCGG 60 

CGCGCAGGGC GGCAGCATCC ACTCGGGCCG CATCGCCGCG GTGCACAACG TGCCGCTGAG 120

CGTGCTCATC CGGCCGCTGC CGTCCGTGTT GGACCCCGCC AAGGTGCAGA GCCTCGTGGA 180 

CACGATCCGG GAGGACCCAG ACAGCGTGCC CCCCATCGAT GTCCTCTGGA TCAAAGGGGC 240 

CCAGGGAGGT GACTACTTCT ACTCCTTTGG GGGCTGCCAC CGCTACGCGG CCTACCAGCA 300

ACTGCAGCGA GAGACCATCC CCGCCAAGCT TGTCCAGTCC ACTCTCTCAG ACCTAAGGGT 360

GTACCTGGGA GCATCCACAC CAGACTTGCA GTAGCAGCCT CCTTGGCACC TGCTGCCACC 420 

TTCAAGAGCC CAGAAGACAC ACCTGGCCTC CAGCAGGCTG GGCCATGCAG AAGGGATAGC 480 

AGGGGTGCAT TCTCT1TGCA CCTGGCGAGA GGGTCTGACT CTGGGCACCC CICTCACCGG 540 

CTACAAGGCC TTGGACTCAC TGTACAGTGT GGGAGCCCCA GTTCCCACCT CTGTGACAAT 600 

AGGATCATGG CCTTACCCTT GAAGCATTAC CGAGAAGGAG AACAGAGATG GGCTTGAAGA 660 

GCCACGTGCT GCCGGCTCCA AATTCCCAAG GACAAGGATC CCTCTGCATT TTTGTCTATG 720 

TAACCTCTTA TATGGACTAC ATTCAGCTGC AAGGAAAGGA AAACCTTGAT TGCAGTGGTT 780 

TAAACAAACA GAAGATTGTT TTTCCACATA GCATGGATTC TGGAGATGGG TGGCTAATGG 840 

TATTGGTTCA ACAACTCCAC GGAGGTAGGG GTCACGTCTT GGATCCTTTT GGCTTAATCT 900 

CAGTGCTCGT TACTTCATGG TCCCAAGATG GCTGCTGTAT CCCCAAGAAT CATGTCTGCG 960 

TTCAAGGAAG GAGGGGTGGA GGAAGAGGAA GGGCCAAACT AGCTGGACCC GTCACCTTCT 1020 

ATCAGAAAGT AAAACCTCGT CAGAAGTCTG TTTCCTGCTC TCTCCCTCTG CATATCTTCA 1080 

CTTAGATGCC CTTGGCCCGA GCCAGCTACC ATTGCACCTC TAGCTGCAAA CAAAGCTAAG 1140 

ACAGCAGGGA ACAGAATTGT CATGGCTGAA TAGACCAATC GTGTTCCATC TACTGAGACT 1200 

GGCACACTGC CTCCTGCAAT AAAACTGGGA TCCCATTACC AAGAGAGAAA TGCAGAATTG 1260 

TGTACCAGTT AGCTTTTGCT GTGTAACAAA CCATCCCCAA ACTTGGCAGC TAGAAACAAA 1320 

CCCTGTATTT TCCCACAATC CTATGGGTTG GCAATTTGGG CTGGGCTCAA CAGGGCAGTT 1380 

CTGCTGCTCA CACCTGGGAT CCCTCATGGA GCTAAGGTCA GCTGTTACCT CAGCTGGGCC 1440 

TGGATGGTCT AGGATAGCCT TACTCACTTG CCTGGCAGGT GACAGGCTG7 TGGCTGGAAT 1500 

TGCTTGGTTC TCCTCCATGT GGCCTCTCCA GCAGGCTAGC TCAGGCTTAT TCACATGATG 1560 

GCTTCAGGAT TCCAAAGAGA GTGAGAGTAG AAGCTGAAAG ACTTCTTGAG TTCTTGGCCT 1620 

GGAACTGGGA CTAGGACAGT GTCACTTCTG CTAAGTTCTT TTGGTCAGAG CAAATCACAA 1680 

GGCTTTACCC AGATTCAAGG GATGAGAAAC AGACTACATG TCTTGATGAG GGGAACCACA 1740

AAGAGCTTGT GGCCATTTTT CACCTATCAC AAATAATTTT GGATGGGTAT TTATTTGGAT 1800 

AAAGGTATTT CCCTCTTCCC CCTTTCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860 

GAAGGCACTA AGACATTGTC CTGGCCCTCA GGGTCTAGGG GAAGAGGTGT TGGGGCAGGA 1920

AGTGAGTCTC TCCATGGGCT GGACCCACTG TAGTAGGAGT GCCTCCTTGT CTGCACTGCT 1980 

GGTATGGGGT TAGGCCAGGT AGGACATTCC AGAGGGGCTT CTGAAAACCA AGAGTCCCTG 2040

GGGAAAGGGA ACAGAGTAAG GCAGGCCTTG TTCTCACTGC CCTCTAAGGG AACTTGGTCA 2100

CTCGGCACTT TTAAGCCTCA GTTTCTCCAG TTCAATAATA AGGACAAGAG CTTTTCCCAT 2160 

GCATTCTCTT TCCCCGGGAA AGTTGACTGA GGTGACCAGT AATAGAATTG AAAAGGGAGA 2220

GTGTCTTCAG TGCAATGTGG CATCCTGGAT TGGGTCTTGG AACAAAAACA GGACATTAGT 2280

GGGAAAATTG GAAATCTGAA AAAAGTCTGA ATTTTAGTTA ATA7ACCAAT TTCAGTCTCT 2340

TCCTTTTGAC AGATGTACCA TGGTGATGTA AGATGTTGAC CTTGGGGTAG GCTGGGTGAA 2400

GGGTATACAG GAACTCTTTG TACTATCTCT GCAACTTCTC TGTAAATCTA GTATCATTCC 2460 

- - A AAAAAAAAAA AA 



TLGRACAGRC APEGPGPSGG AQGGSIHSGR IAAVHNVPLS VLIRPLPSVL DPAKVQSLVD 
TIREDPDSVP PIDVLWIKGA QGGDYFYSFG GCHRYAAYQQ LORETIPAKL VQSTLSDLRV 
YLGASTPDLQ 

Seq ID NO: 225 DNA sequence 
Nucleic Acid Accession #: NM_021048
Coding sequence: 1..1110



1 11 21 31 41 51

ATGCCTCGAG CTCCAAAGCG TCAGCGCTGC ATGCCTGAAG AAGATCTTCA ATCCCAAAGT 60 

GAGACACAGG GCCTCGAGGG TGCACAGGCT CCCCTGGCTG TGGAGGAGGA TGCTTCATCA 120 

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCC 180 

TCCTCCTGCT ATCCTCTAAT ACCAAGCACC CCAGAGGAGG TTTCTGCTGA TGATGAGACA 240

CCAAATCCTC CCCAGAGTGC TCAGATAGCC TGCTCCTCCC CCTCGGTCGT TGCTTCCC7T 300 

CCATTAGATC AATCTGATGA GGGCTCCAGC AGCCAAAAGG AGGAGAGTCC AAGCACCCTA 360

CAGGTCCTGC CAGACAGTGA GTCTTTACCC AGAAGTGAGA TAGATGAAAA GGTGACTGAT 420 

TTGGTGCAGT TTCTGCTCTT CAAGTATCAA ATGAAGGAGC CGATCACAAA GGCAGAAATA 480

CTGGAGAGTG TCATAAAAAA TTATGAAGAC CACTTCCCTT TGTTGTTTAG TGAAGCCTCC 540

GAGTGCATGC TGCTGGTCTT TGGCATTGAT GTAAAGGAAG TGGATCCCAC TGGCCACTCC 600 

TTTGTCCTTG TCACCTCCCT GGGCCTCACC TATGATGGGA TGCTGAGTGA TGTCCAGAGC 660 

ATGCCCAAGA CTGGCATTCT CATACTTATC CTAAGCATAA TCTTCATAGA GGGCTACTGC 720 

ACCCCTGAGG AGGTCATCTG GGAAG CACTG AATATGATGG GGCTGTATGA TGGGAIGGAG 780 

CACCTCATTT ATGGGGAGCC CAGGAAGCTG CTCACCCAAG ATTGGGTGCA GGAAAACTAC 840

CTGGAGTACC GGCAGGTGCC TGGCAGTGAT CCTGCACGGT ATGAGTTTCT GTGGGGTCCA 900 

AGGGCTCATG CTGAAATTAG GAAGATGAGT CTCCTGAAAT TTTTGGCCAA GGTAAATGGG 960 

AGTGATCCAA GATCCTTCCC ACTGTGGTAT GAGGAGGCTT TGAAAGATGA GGAAGAGAGA 1020 

GCCCAGGACA GAATTGCCAC CACAGATGAT ACTACTGCCA TGGCCAGTGC AAGTTCTAGC 1080 

GCTACAGGTA GCTTCTCCTA CCCTGAATAA

Protein Accession #: NP_066386 



MPRAPKRORC MPEEDLQSQS ETQGLEGAQA PLAVEEDASS STSTSSSFPS SFPSSSSSSS 60 
SSCYPLIPST PEEVSADDET PNPPQSAQIA CSSPSWASL PLDQSDEGSS SQK2ESPSTL 120

QVLPDSESLP RSEIDEKVTD LVQFLLPKYQ MKEPITKAEI L3SVIKNYED HFPLLFSEAS 180 

ECMLLVFGID VKEVDPTGHS FVLVTSLGLT YDGMLSDVQS MPKTGILILI LSIIFIEGYC 240 

TPEEVIWEAL NMMGLYDGME HLIYGEPRKL LTQDWVQENY LEYRQVPGSD FARYEFLWGP 300 

RAHAEIRKMS LLKFLAKVNG SDPRSFPLWY EEALKDEEER AQDRIATTDD TTAMASASSS 360
ATGSFSYPE 



T GGTTCTGCAA 

AGTATGGCTA CAGGGGCCAC TTTCCCTGAG GAAGCCATTG CTGACTTGTC AG7GAATATG 
TATAATCGTC TTAGAGCCAC TGGTGAAGAT GAAAATATTC TCITCTCTCC ATTGAGTATT 
GCTCTTGCAA TGGGAATGAT GGAACTTGGG GCCCAAGGAT CTACCCAGAA AGAAATCCGC 
CACTCAATGG GATATGACAG CCTAAAAAAT GGTGAAGAAT TTTCTTTCTT GAAGGAGTTT 
TCAAACATGG TAACTGCTAA AGAGAGCCAA TATGTGATGA AAATTGCCAA 
GTGCAAAATG GATTTCATGT CAATGAGGAG TTTTTGCAAA TGATGAAAAA 

GCTGCCACTT ATCTGGCCCT CATTAATGCT GTCTATTTCA AGGGGAACTG GAAGTCGCAG 660

TTTAGGCCTG AAAATACTAG AACCTTTTCT TTCACTAAAG ATGATGAAAG TGAAGTCCAA 720

ATTCCAATGA TGTATCAGCA AGGAGAATTT TATTATGGGG AATTTAGTGA IGGCTCCAAT 780

GAAGCTGGTG GTATCTACCA AGTCCTAGAA ATACCATATG AAGGAGATGA AATAAGCATG 840

ATGCTGGTGC TGTCCAGACA GGAAGTTCCT CTTGCTACTC TGGAGCCATT AGTCAAAGCA 900 

CAGCTGGTTG AAGAATGGGC AAACTCTGTG AAGAAGCAAA AAGTAGAAGT ATACCTGCCC 960 

AGGTTCACAG TGGAACAGGA AATTGATTTA AAAGATGTTT TGAAGGCTCT TGGAATAACT 1020 

GAAATTTTCA TCAAAGATGC AAATTTGACA GGCCTCTCTG ATAATAAGGA GATTTTTCTT 1080 

TCCAAAGCAA TTCACAAGTC CTTCCTAGAG GTTAATGAAG AAGGCTCAGA AGCTGCTGCT 1140

GTCTCAGGAA TGATTGCAAT TAGTAGGATG GCTGTGCTGT ATCCTCAAGT TATTGTCGAC 1200 

CATCCATTTT TCTTTCTTAT CAGAAACAGG AGAACTGGTA CAATTCTATT CATGGGACGA 1260 

GTCATGCATC CTGAAACAAT GAACACAAGT GGACATGATT TCGAAGAACT TTAAGTTACT 1320 

TTATTTGAAT AACAAGGAAA ACAGTAACTA AGCACATTAT GTTTGCAACT GGTATATATT 1380

TAGGATTTGT GTTTTACAGT ATATCTTAAG ATAATATTTA AAATAGTTCC AGATAAAAAC 1440

AATATATGTA AATTATAAGT AACTTGTCAA GGAATGTTAT CAGTATTAAG CTAATGGTCC 1500 
TGTTATGTCA TTGTGTTTGT GTGCTGTTGT TTAAAATAAA AGTACCTATT GAACATGTG 



1 11 21 31 41 

| | | | |

MAFLGLFSLL VLQSMATCAT FPEEAIA 
ELGAQGSTOK EIRHSMGYDS LKNGEEFSFL K 
NEEFLQMMKK YFNAAVNHVD FSQNVAVANY I 
INAVYFKGNW KSQFRPENTR TFSFTKDDES EVQIPMMYQQ GEFYYGEFSD GSNEAGGIYQ 
VLEIPYEGDE ISMMLVLSRQ EVPLATLEPL VKAQLVEEWA NSVKKQKVBV YLPRFTVEQE
IDLKDVLKAL GITEIFIKDA NLTGLSDNKE IFLSKAIHKS FLEVNEEGSE AAAVSGMIAI 
SRMAVLYPQV IVDHPFFFLI RNRRTGTILF MGRVMHPETM N7SGHDFEEL 



Seq ID NO: 229 DNA sequence



CGACATCAGA GATGAGGACA GCATTGCTGC TCCTTGCAGC CCTGGCTGTG GCTACAGGGC 



ATCTGGTGAA GAAGGACTGT GCGGAGTCGT GCACACCCAG CTACACCCTG CAAGGCCAGG 
TCAGCAGCGG CACCAGCTCC ACCCAGTGCT GCCAGGAGGA CCTGTGCAAT GAGAAGCTGC 
ACAACGCTGC ACCCACCCGC ACCGCCCTCG CCCACAGTGC CCTCAGCCTG GGGCTGGCCC 
TGAGCCTCCT GGCCGTCATC TTAGCCCCCA GCCTGTGACC TTCCCCCCAG GGAAGGCCCC 
TCATGCCTTT CCTTCCCTTT CTCTGGGGAT TCCACACCTC TCTTCCCCAG CCGGCAACGG 
GGGTGCCAGG AGCCCCAGGC TGAGGGCTTC CCCGAAAGTC TGGGACCAGG TCCAGGTGGG 
CATGGAATGC TGATGACTTG GAGCAGGCCC CACAGACCCC ACA3AGGATG AAGCCACCCC 
ACAGAGGATG CAGCCCCCAG CTGCATGGAA GGTGGAGGAC AGAAGCCCTG TGGATCCCCG 
GATTTCACAC TCCTTCTGTT TTGTTGCCGT TTATTTTGTA CTCAAATCTC TACATGGAGA 
TAAATGATTT AAACC 

Seq ID NO: 230 Protein sequence: 
Protein Accession #: NP_003686



[jLLAA LAVATGPALT LRCHVCTSSS NCKHSWCPA SSRFCKTTNT VEPLRGNLVK 
3 YTLQGQVSSG TSSTQCCQED LCNEWjHNAA PTRTALAHSA LSLGLALSLL 
AVILAPSL 




1 11 21 31 41 51 

I I I I I I 

CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 
AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCACTCAG 
AGAAGATGAA GGATAT CG AC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 
GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 
GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 
TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 
GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTAC7TCC AAACACCAGC 
ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 
CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 
ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 
AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 
TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAAA7T 
TTCAGGATGG CTGTATTCTG CGGTCAGAAT GAGAGAGTCA AGCTGGGCAG AATCTCTCGC 
CAAGAGTTCA GCCTTCCTTT GGAGACTGCT CCATCAGTGC CGAGGTGTGT GGGAACAGGC 
TTCACTGCAC CGCCATCTTA CTGAGTTGCT TCACGTGAGG AAAAGGGGGC TTTGGCCCTG 
TGACTCAGTT CCACATTTTG GATTGCATAC TGGAAAAGAA GCCAATCTTC TTGCTAGTAA 
ACCAGCAACC CGGCTGTATA CAGTGGTGAC CCAAGCAATG GATATAAACC TAAAAATCTG 
AGGGAGGGGA GAGGTGGAAT ACAGTAGTTC TTGGAATCTG AAGTCTCCTA TTTGATCAGG 



TCTGGTATTA ATTTAATCTC AGGAAAAACA AGAAATTAAC CCAGAGAGAG T 

GGAATTCAGC GTAGCTACCT CCAGACCGTG GTGTCTGGCC TCCATTTTTG TCTGTCATTC 1260

AGCTCTGACT TACAGCTGCA GTCACCTTTG CTATAAGGCA CCTGGGTAGA AGGGTGGATG 1320 

GGCTTCACAT CAATTTTTTT CTTCCTTTAG GGTGGGGGAT TGGTTTGGCT TTCTTTTGTT 1380

GTGGTTTTTT GTTTTATTTT TGTCAAGATT GATTTTTAGA TGCAAGGACT TGAAAAGACC 1440 

CAGAAGGATC CCACCAGTTT TTCCTTGAGG CCTAGGATTT TTTATTCTGT CCCGAGCAGA 1500 

GGTAATTCCT CACAACTTAG TGCACCAGTA GCACCAGCCA T7TTGAGCAG AGTACCTCTT 1560

TGGGGAGCTT TTCGTTTTGT TTTGTTTTTA ATTCTCTTTC CTTAGCAGCA AGGTCTTTTT 1620

TCCTAGAGAA TCTACTCCGT TGCAGAATCA TTGCAACCTC AGGAGCCCTC ACTGATTGAG 1680 

TGCTGTCAGC CTGATATACT ACTTTGGACT CTGGAAACAG ATATGGGTTC TATTCTCTAT 1740 

TTCTACTGTG TGTCGTTAAA CAACCGTCGG AGACCAGATG ACCTGTTAGA TGGCTAGTCC 1800 

TGTATAACTC GACTCTGTAT GTTTCAATGT ATGTTACTGC AATGCTTCAC CTGCTGTACA 1860
GTGTTTGTGA GATGCTCTTT GAAGATGGTA CTTTTATATT T 



I I I I 

' ' - - GTTTTACTTA TTGAGAGTGT 

TTTACTCCCC AAATTATTCA 
TGTTTCTTAG ATCGTAGTCA TTGAGAAGTC CCAATAACTC T 
TAGTAAACTT CTCTTTCATC TTTGTGTTAG CTCTGTAGTC T 
TTTGTTTCCA AAGTCACAAT TGAATTATTC TTAGATACCT TAAGCCACTG A 
TGTTTGACTG AAAGCAAAAC AACGTGACAG TTTATTTTCA AACACTAACT TCTTGATATT 
TTGTTATGGT ATATCTTTTT ATTAAATATT TATTTTGACT AAGCTTTCAT AAAATATTTG 
AAGCTATTTT AATCATCAAG TATGGAAAAC AAATTACTAT TGCATTTTCC TATATATGCA 
TATATTATGG ATTAACCAGA ATTGTATCAT TTTTGGCCTA ATGTCTGGAT A 



A AAAGTAGTTA CAAATTAAAC TTACTAATTT ATACCTC 
AATTAAAGTA CATTTTAAAT GAGCTTTATA ATACCTTAAA AAGTTGGTTC TAATTTAAAA 
TATGAAAGCT CTGGCTATCA TCCTGGGATA GTAATTTCTA ATTATATAGT ATTTCAAAAC 
TATATATTTT TTAGTTCCTT TGAGATAACT AATTTCTAAT TATATAT 
TATCCTGTAT TTTTTTTAAG AATTGTTTTA TAAATAGGTC ATAAGATACA A 
TAGAAGACCC ACTCTTACTA GGTTCCCTAA GGATCTGCCA TAGATT1 
TTTTTTTTAG GTAGTTTAAA GCAAGCACTG ATACCAGTGG GAGTTGGTCT T 
GATTCTGTTA AGCATCCAAA AACAATGCCT AATTTCAGTT CTTAGGTTAT GGCTTGTGAC 
TCCAGATAAA AGATGGAGAA TACCTCATGT ACTGTGACTT GAAAATGAAT TCTTAAAATT 
CTTAGGCTCT CTCCATGTAT CTTTCTTAAG GAAAAGTTTC TGAGTGTGAT C 
CCATAGTATC AAGTGGAGGG TAGTTCAGAA AAGTTAATAG GAAATCTTTT G 
ACTATAATAG AAGTTTGAGT AATATTTTAA TAAATTTATA TAATTCAAAT GATAAAAATG 1320 
TATCAATGTT ATCCAATGAT TTTTATTAAA AAATTACCTT ATTATTAGAA CTG7GCCTAT 1380 
TACATAAAAA GTGCTCATGT ATTTGAATTT TAAATAA7TT ATTTAAATCA AGACCACCAT 1440 
AAGTCATTAA TAATTTAATA ATTGTTTTAA ATCAGTGGTT TTCAACCCTC ACTTCATATT 1500 
AGAATCATCT GAGGACTTTT AATATGGAAT CCACCTCATA ACAATTAAGT CTAAATTTCT 1560
GGAAGATGGA GCCATGCTTG TTTTTCCAAA AGCTCTTTGA GTGATTCTAA TTTGTAGTCA 1620 
GAGTTGAAGA CCACTGCTCT AAATTAGTGC AGGAAAATGC TTTTATTTCT CCCATGTTAA 1680 
CTTTTAAAAC TAGTAATGTA CCCAGTTAAG TTTTGATGGT TTAAATTCCA CTAAAGAACA 1740 
TATTCTTCTA ATAACTAGCA TTTATTACAT GAAATTTAAG AGTTTAAGTT CCATCAAACT 1800 
AGCCCTTGTG TAAGATTATT ATTTCTTCTC TATAACTTCA AAATAGATAT TTCATTCAAA 1860 
CTGTTCAGGT GAGAAAACAT AATGGATTTT TTTTTTTTTC CTCTGGAGCT GCCTGTTCAG 1920 
TGAGATGGAG GAGGTGGGCA CATTTAAGGT CAGTTCACTA ACCTATGGTT CAGAGTTCTG 1980 
ATCATATGGA AGTTTGGAAA AGAGAGCTTA TCACAGGTTT GTATGCTGGT GAATGGATAG 2040 
TTTTAATTCT CACTGTCTCA AAAGAGAATC AGCTCTCCAG CAGTTCTAGA AAAGCTTTGA 2100 
CAATCCCCAA GGGGCAGTGT TACCTTACTC CTTCACTGCT TCITAGAAGG TAGAATTAAG 2160 
TTTCTGGAAT TGCACCTACA TGTTTTCTTA TTAACATTCA GAATTGGGAA TATTAATTTT 2220 
A AATTGGTAAC T 1



AAAACCTCAT GCCTTTTCAT 



CTTCAGAAAT CCATATATTT 
AGATGGTATT TAAAATGAAT 
TTGGAAGCCG CCAGCCATTC 
T ACTATGGTAT 
A CCATAACCTG 
AT AGAGATT C TTCTTTTATG 
AAGACATTAA CATAAGTCTC 
CCACATTAAA CAACCACGGC 
TGTGCCTGGT ATCGCCTCTG 
CCTTCATCAA GCACTTGCCA 



AAGTATTAAG 



AAGAAGAGCT 
TGAGCAGTGA 
AACACT CAGA 
GCATAACTTA 



TTTTTTTAGA AACCTCCTAA 

TTTGTCCAAA AGTTTATCTG 
AATAATTTAA AATTGTATGC 
TATTCATTAT TAAATTGGTA 
CACAAAATCC AACATTGTAT 
ATTACCAGTG CATCTGCACA 
TACATTTTCA AACATGAAGA GTGACAACCA 
CCATCCTATA 



: AAAACCACTA 
CACACAACCA ACACACCACG 
GACAACACAT CACATACACT 



CTATCAACTG 
AATCATAACC 
ACCAAACACC 
CACTACCCCC 



CTCTAACTTG TACAACCTTA CCAAC7CACC 
CCAACCTAAA GACCCCCAAC ACAACACAAC 
ACCACACACG CCACACACCA CACACCCACC 
CCACCACAAA CAAGCTAACA ACCACAAACA 
CCATACTCCC ACCCACCA 



Seq ID NO: 234 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 27-281 



GTCTACCGCG TAGCAGTTAC 
TTCCTGCCCC 
GGCGCTCACT 
CAGGGTTCCG 
GAGTTTTTCT 
CAGAAAGAAT 
ATGTTGCAGG 
TTTTCTTCTC 



GAAACAGTGT 
ACCAATCCAA 
TTGCTCTGAT 
GCAAGGAGAT 
AAGAGCTAGT 
CACTTGGCAT 



GAAGACATGC 
ATCAGACTGA 
GGGGCTGACG 
GTTGCTCCAC 
GAGCCTTGCA 
CTTGGAGACA 
AGACCAACGT 
CTTTCAGGCT 
ATCAAGAGCC 



I 

AGCCCTTGAA 
GACACTTCCT 
CCATTTTAGG 
AGCGCCTTGT 
GAAAGCATTA 
TCCCTCTGCC 
GAGATTCTCC 



GACCCAGAGA 
GTTTACAGGA 
CCTCAGCCCA 



ACGTGCTTTT 



GATGCCAGGA 
TCAAGCCAAG 
CTCAAACCCG 



AGACAGCCTG 



CCAACAGGTG 

TGACGTTTCA 

TAACAGCAGC 
CCTGTCACTG 



CAGGGGTAAA 
TTCTGTTTTT 
CTCTAGAACC 
GATGGCTTGG 
AAAGCAAAAG 
TGACACGATC 



AAGTGGTAAA 



TTCATGCACT 
CCTGAGAAAG 
GACTAAAACA 
GTAGCAAAAA 



GAGGCCGTCT 
GACTATAAAA 
TCTGCACCCA 
GGCGCGCTCT 
CTCTTTGGCA 
ATAAGGAATA 
CAAGAGAAAG 
AATGTCCAGC 
GGAAATGTTT 



TCCCCTACCA 



AACTGAATTC 
TATTAGTAAT 



CAACTCTGAG 
ATTTCCAGTA 
ACCACAGATT 
TACATTGTAA 



AGGGAAACAG 
GTTTGATTAA 
AGTATGCCAG 
TACATGCATT 



ATTCTCATTT T 
CACTGTCTCG C 
GAAAAGAAAA T 



GATTAAAACA 
AAATCTCAAG 
AAAAAAAAGC 
G I"A _"I T( :t( t 
TCCCATCAAA 
ATAATCCC 



Seq ID NO: 236 DNA sequence 
Nucleic Acid Accession #: NM_002075
Coding sequence: 406..1428



AGGAGAATAA 



CTAATAAGTG G 
ATCCATATCC C 
CGTCCATTAC ATCCAAAGGA 840 
GCCAGTGAAG C 
ACTTAAAAAA A 
TGAGAACAAT TAAGAAAAAA 
AGAGGAAGAA G 
AGAAATGACC T 
TTCGAATTAA T 
ACGATGATTA A 
ATCAAAACAA G 
TTGCAAATAC A 
CATTTTTTAA ATTTGAAAAA 
TGCAACAACA AAAAAGGTAT 
ATTCTCACGA CTACCTTTGA 
ACATTAAAAA 



C 7TCAACCACA 
h TGATTTACTT 
A ACATCTGAAA 
A GGTTCTCAGT 



CCACAATAGG 
ACAGGAT CAG 
AGTCCTTTCT 



GGCCAGGCCA 
CGTCGCAGCT 
CCAGCCAGAG 



GGCAGACCTG TCCATCCTTC 
AC CC AG AGGC 

GCTGAGGAGC 

GAGGGAGTAA 



ACCACCAACA 



CTCAAATCCC 



AGGTGCACGC 
GGAACTTTGT 
GTGAGGGCAA 



CCTGCCTGTA 
ACAGTTTGAG 
TCTGGCAGCA 
GGAGGCTCCC 
AGAGTGACCC 
GCAGCTCAAG 
GCTGGTGTCT 
GGGACACCTG 
TGCCTCGCAA 
CATCCCACTG 
GGCATGTGGG 
TGTCAAGGTC 

A GACTGGGCAG 
I GTCTCCTGAC 
A TGTGCGAGAG 



TCTGTGGGTC CCCTGTACCT TTCTCCCCCA 
GGTTTGTCGA GAAGAAGGAT TATCCAGATC 
CCCTCCCATA CTCACCAAAC CCTCTTCCCC 
GCCCCCCCAA CCCCCCGCCG GTCGGGGCCA 
GAGCCTGGGC AGGTGACGGG CGGGCGCGGG 



CTCGACCTGT 
AAGCAGATTG 
GGCCTAGAGG 
GCCAAGATTT 
GATGGGAAGC 
CGCTCCTCCT 
GGGCTGGACA 
AGCCGGGAGC 
AATATTGTGA 
CAGAAGACTG 
TTCAATCTCT 
GGGACCTGCC 



CAGCCATGGG 



TGGTGGGACG 



GGAGATGGAG 
GAAAGCCTGT 
AGTCCAGATG 



TGATCGTGTG 
GGGTCATGAC 
ACATGTGTTC 
TTTCTGCTCA 
CCAGCTCGGG 
TATTTGTGGG 
TCATTTCGGG 
GTCAGACTTT 



GGACAGCTAC 
CTGTGCCTAT 
CATCTACAAC 
CACAGGTTAT 
GGACACCACG 
GAGTCGGACA TCAACGCCAT CTGTTTCTTC CCCAATGGAG AGGCCAICTG CACGGGCTCG 1140

GATGACGCTT CCTGCCGCTT GTTTGACCTG CGGGCAGACC AGGAGCTGAT CTGCTTCTCC 1200 

CACGAGAGCA TCATCTGCGG CATCACGTCC GTGGCCTTCT CCCTCAGTGG CCGCCTACTA 1260 

TTCGCTGGCT ACGACGACTT CAACTGCAAT GTCTGGGACT CCATGAAGTC TGAGCGTGTG 1320

GGCATCCTCT CTGGCCACGA TAACAGGGTG AGCTGCCTGG GAGTCACAGC TGACGGGATG 1380 

GCTGTGGCCA CAGGTTCCTG GGACAGCTTC CTCAAAATCT GGAACTGAGG AGGCTGGAGA 1440

AAGGGAAGTG GAAGGCAGTG AACACACTCA GCAGCCCCCT GCCCGACCCC ATCTCATTCA 1500 

GGTGTTCTCT TCTATATTCC GGGTGCCATT CCCACTAAGC TTTCTCCT7T GAGGGCAGTG 1560 

GGGAGCATGG GACTGTGCCT TTGGGAGGCA GCATCAGGGA CACAGGGGCA AAGAACTGCC 1620

CCATCTCCTC CCATGGCCTT CCCTCCCCAC AGTCCTCACA GCCTCICCCT TAATGAGCAA 1680 

GGACAACCTG CCCCTCCCCA GCCCTTTGCA GGCCCAGCAG ACTTGAGTCT GAGGCCCCAG 1740

GCCCTAGGAT TCCTCCCCCA GAGCCACTAC CTTTGTCCAG GCCTG3GTGG TATAGGGCC-T 1800

TTGGCCCTGT GACTATGGCT CTGGCACCAC TAGGGTCCTG GCCCTCTTCT TATTCATGCT 1860 

TTCTCCTTTT TCTACCTTTT TTTCTCTCCT AAGACACCTG CAATAAAGTG TAGCACCCTG 1920
GT 



| | I I I I 

MGEMEQLRQE AEQLKKQIAD ARKACADVTL AELVSGLEW GRVQMRTRRT LRGKLAKIYA 
MHWATDSKLL VSASQDGKLI VWDSYTTNKV HAIPLRSSWV MTCAYAPSGN FVACGGLDNM 
CSIYNLKSRE GNVKVSRELS AHTGYLSCCR FLDDNNIVTS SGDTTCALWD ISTGQQKTVF 
VGHTGDCMSL AVSPDFNLFI SGACDASAKL WDVREGTCRQ TFTGHESDIN AICFFPNGEA 
ICTGSDDASC RLFDLRADQE LICFSHESII CGITSVAFSL SGRLLFAGYD D~'~" " '- ■ — 
KSERVGILSG HDNRVSCLGV TADGMAVATG SWDSFLKIWN 



I I I I 

. ... 3 TNGAACCTAC CATAAATTCT TTTCTTACNG GACAATCTTA TM CTAANCAA 

TACC i TTGC TTTTAAGGCA GATAATCCTC CAAGTTTTCT AATGATATCT GAAACTATTA 
ACTGATTCTG TGAATTATGA AATCTGAAAA GGAATTGGAA GTTGCTAAAA ATCTATCATT 
TGCATTGACC AGTGTGAAGC ACAGTGGAAT GAGAATGCGT GCCCTGACAC CAAAGAAAAA 
TAAGTGACTG GAAAGCTGAA GAATCACCGG CTTCAGTGAC ATGGAACCCA GTGATTTGAT 
TTTTGACGAG TATCGGGTGA CTTTGAGGTG GTCAAGAAAC CACACTTTAA GAACAATGTC 
CAAAAAGGGG AAAAAAAAGA GCAACCAAAG AAAAAAAATC CATAAAATTG CACAGAAGAA 
AAGAAAGAAA AATAAAATAC ACAATATGGA CGATGGAGAA AAACAGTTAC ATTTCTTTAT 
GGATCAAGAA GTTTGTGTAC ACATAATCTC ATTTTGAGAT A 
CAGAAGTCAA TCAAAATATT TCAAAATGCT G 
TAGAAAAGTT TTTCTGTAAA AGTCAGATAG T. 

CAACTACTCA ACTTTCCTAC TGTAGCACAA GAGTAGCTGT GGTACTGTGC A 
CTTGTGTTCC AATAAAGCTT CATTTACAAA AACATGCCAT GGGCCATATT TGGCCTGTAC 780 
ACTGTTGTTT GCCAAGTCCT AATATAGTTC CTTAGCAAGT ATTG7GAGCT ATTTGAGGAA 840
GACATGAAAG TTCATTGGGT TGCTAAAAAG TATGTAGAAA TTCAAAGGAA AATTAAAATT 900 
TAGGCTAAGT TATAATACAC TGTTTTAACA ATTGTAAAAT GTAAGAGAAA TTTACAAATA 960 
AAAATCCCAA ATAAAA 

Seq ID NO: 239 DNA sequence



1 11 21 31 41 51

I I I I I 
GGGGGGGGGG GGCACTTGGC TTCAAAGCTG GCTCTTGGAA ATTGAGCGGA GAGCGACGCG 
GTTGTTGTAG CTGCCGCTGC GGCCGCCGCG GAATAATAAG CCGGGATCTA CCATACCCAT 
TGACTAACTA TGGAAGATTA TACCAAAATA GAGAAAATTG GAGAAGGTAC CTATGGAGTT 
GTGTATAAGG GTAGACACAA AACTACAGGT CAAGTGGTAG CCATGAAAAA AATCAGACTA 
GAAAGTGAAG AGGAAGGGGT TCCTAGTACT GCAATTCGGG AAATTTCTCT ATTAAAGGAA 
CTTCGTCATC CAAATATAGT CAGTCTTCAG GATGTGC1TA TGCAGGATTC CAGGTTATAT 
CTCATCTTTG AGTTTCTTTC CATGGATCTG AAGAAATACT TGGATTCTAT CCCTCCTGGT 
CAGTACATGG ATTCTTCACT TGTTAAGAGT TATTTATACC AAATCCTACA GGGGATTGTG 
TTTTGTCACT CTAGAAGAGT TCTTCACAGA GACTTAAAAC CTCAAAATCT CTTGATTGAT 
TTGCCA GAGCTTTTGG AATACCTATC 



AGAGTATATA CACATGAGGT AGTAACACTC TGGTACAGAT CTCCAGAAGT ATTGCTGGGG 660 

TCAGCTCGTT ACTCAACTCC AGTTGACATT TGGAGTATAG GCACCATATT TGCTGAACTA 720 

GCAACTAAGA AACCACTTTT CCATGGGGAT TCAGAAATTG ATCAACTCTT CAGGATTTTC 780

AGAGCTTTGG GCACTCCCAA TAATGAAGTG TGGCCAGAAG TGGAATCTTT ACAGGACTAT 840

AAGAATACAT TTCCCAAATG GAAACCAGGA AGCCTAGCAT CCCATGTCAA AAACTTGGAT 900 

GAAAATGGCT TGGATTTGCT CTCGAAAATG TTAATCTATG ATCCAGCCAA ACGAATTTCT 960 

GGCAAAATGG CACTGAATCA TCCATATTTT AATGATTTGG ACAATCAGAI TAAGAAGATG 1020 

TAGCTTTCTG ACAAAAAGTT TCCATATGTT ATGTCAACAG ATAGTTGTGT TTTTATTGTT 1080

AACTCTTGTC TATTTTTGTC TTATATATAT TTCTTTGTTA TCAAACTTCA GCTGTACTTC 1140 

GTCTTCTAAT TTCAAAAATA TAACTTAAAA ATGTAAATAT TCTAIATGAA TTIAAATATA 12 00 
ATTCTGTAAA TGTGAAAAAA AAAAAAAAAA AAAAA 


Seq ID NO: 240 Protein sequence: 
Protein Accession #: NP_001777.1 




MEDYTKIEKI GEGTYGWYK GRHKTTGQW AMKKIRLESE EEGVPSTAIR 3ISLLKELRH 50 
PNIVSLQDVL MQDSRLYLIF EFLSMDLKKY LDSIPPGQYM DSSLVKSYLY QILQGIVFCH 12 0 



Y THEWTLWYR SFEVLLGSAR 
I DQLFRIFRAL GTPNNEVWPE VESLQDYKNT 
FPKWKPGSLA SHVKNLDENG LDLLSKMLIY DPAKRISGKM ALNHPYFNDL DNQIKKM 



I I -I I I I 

CGCCCGCGCG CGGGCTCAAC TTTGTAGAGC GAGGGGCCAA CTTGGCAGAG CT 

GCTTTGCAGA GAGCGCCCTC CAGGGACTAT GCGTGCGGGG ACACGGGATC TACCCATACC 

ATTGACTAAC TATGGAAGAT TATACCAAAA TAGAGAAAAT TGGAGAAGGT ACCTATGGAG 

TTGTGTATAA GGGTAGACAC AAAACTACAG GTCAAGTGGT AGCCATGAAA AAAATCAGAC

TAGAAAGTGA AGAGGAAGGG GTTCCTAGTA CTGCAATTCG GGAAATTTCT CTATTAAAGG 

AACTTCGTCA TCCAAATATA GTCAGTCTTC AGGATGTGCT TATGCAGGAT TCCAGGTTAT 

ATCTCATCTT TGAGTTTCTT TCCATGGATC TGAAGAAATA CTTGGATTCT ATCCCTCCTG 

GTCAGTACAT GGATTCTTCA CTTGTTAAGG TAGTAACACT CTGGTACAGA TCTCCAGAAG 

TATTGCTGGG GTCAGCTCGT TACTCAACTC CAGTTGACAT TTGGAGTATA GGCACCATAT

TTGCTGAACT AGCAACTAAG AAACCACTTT TCCATGGGGA TTCAGAAATT GATCAACTCT 

TCAGGATTTT CAGAGCTTTG GGCACTCCCA ATAATGAAGT GTGGCCAGAA GTGGAATCTT 

TACAGGACTA TAAGAATACA TTTCCCAAAT GGAAACCAGG AAGCCTAGCA TCCCATGTCA 

AAAACTTGGA TGAAAATGGC TTGGATTTGC TCTCGAAAAT GTTAATCTAT GATCCAGCCA 

AACGAATTTC TGGCAAAATG GCACTGAATC ATCCATATTT TAATGATTTG GACAATCAGA

TTAAGAAGAT GTAGCTTTCT GACAAAAAGT TTCCATATGT TATGTCAACA GATAGTTGTG 

TTTTTATTGT TAACTCTTGT CTATTTTTGT CTTATATATA TTTCTTTGTT ATCAAACTTC 
AGCTGTACTT CGTCTTCTAA TTTCAAAAAT ATAACTTAAA A 
ATTTAAATAT AATTCTGTAA ATGTGAAAAA AAAAAAAAAA A 



Seq I 



K GRHKTTGOW AMKKIRLESE EEGVPSTAIR EISLLKELRH 

PNIVSLQDVL MQDSRLYLIF EFLSMDLKKY LDSIPPGQYM DSSLVKWTL WYRSPEVLLG 

SARYSTPVDI WSIGTIPAEL ATKKPLPHGD SEIDQLFRIF RALGTPNNEV WPEVSSLQDY 

KNTFPKWKPG SLASHVKNLD ENGLDLLSKM LIYDPAKRIS GKMALNHPYP NDLDNQIKKM 

Nucleic Acid Accession #: AF101051.1



I I I I 

GAGCAACCTC AGCTTCIAGT ATCCAGACTC CAGCGCCGCC C 
CGACCCAGAG CTTCTCCAGC GGCGGCGCAG CGAGCAGGGC T 

GCGGGGCCCA GCCACCTTCG GGAGTCCGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180

ACCTGCCACC CCTGAGCCAG CGCGGGCGCC CGAGCGAGTC ATGGCCAACG CGGGGCTGCA 240 

GCTGTTGGGC TTCATTCTCG CCTTCCTGGG ATGGATCGGC GCCATCGTCA GCACTGCCCT 300 

GCCCCAGTGG AGGATTTACT CCTATGCCGG CGACAACATC GTGACCGCCC AGGCCATGTA 360 

CGAGGGGCTG TGGATGTCCT GCGTGTCGCA GAGCACCGGG CAGATCCAGT GCAAAGTCTT 420 

TGACTCCTTG CTGAATCTGA GCAGCACATT GCAAGCAACC CGTGCCTTGA TGGTGGTTGG 480 

CATCCTCCTG GGAGTGATAG CAATCTTTGT GGCCACCGTT GGCATGAAGT GTATGAAGTG 540 

CTTGGAAGAC GATGAGGTGC AGAAGATGAG GATGGCTGTC ATTGGGGGTG CGATATTTCT 600 

TCTTGCAGGT CTGGCTATTT TAGTTGCCAC AGCATGGTAT GGCAATAGAA TCGTTCAAGA 660 

ATTCTATGAC CCTATGACCC CAGTCAATGC CAGGTACGAA TTTGGTCAGG CTCTCTTCAC 72 0 

TGGCTGGGCT GCTGCTTCTC TCTGCCTTCT GGGAGGTGCC CTACTTTGCT GTTCCTGTCC 780 

CCGAAAAACA ACCTCTTACC CAACACCAAG GCCCTATCCA AAACCTGCAC CTTCCAGCGG 840 

GAAAGACTAC GTGTGACACA GAGG CAAAAG GAGAAAATCA TGTTGAAACA AACCGAAAAT 900 

GGACATTGAG ATACTAT CAT TAACATTAGG ACCTTAGAAT TTTGGGTATT GTAATCTGAA 960 

GTAIGGTATT ACAAAACAAA CAAACAAACA AAAAACCCAT GTGTTAAAAT ACTCAGTGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAAG ATTTTACCAT 1080 

TTGTATTACT GCTTCCCATT GAGTAATCAT ACTCAAATGG GGGAAGGGGT GCTCCITAAA 1140 

TATATATAGA TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

CTCATTATGT TGATACTAGC ATACTTAAAA TATCTCTAAA ATAGGTAAAT GTATTTAATT 1260 

CCATATTGAT GAAGATGTTT ATTGGTATAT TTTCTTTTTC GTCCTTATAT ACATATGTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTTGGGTG CCTTTGCCAC AAGACCTAGC 1380 

CTAATTTACC AAGGATGAAT TCTTTCAATT CTTCATGCGT GCCCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAGCACTTG CATCGTTATT AAGCCCTTAT TTGTTTTGTG 1500 

TTTCATTGGT CTCTATCTCC TGAATCTAAC ACATTTCATA GCCTACATTT TAGTTTCTAA 1560 

AGCCAAGAAG AATTTAT T AC AAATCAGAAC TTTGGAGGCA AATCTTTCTG CATGACCAAA 1620 

T CCTGTTGACC TTCCCACACA ATCCCTGTAC TCTGACCCAT AC-CACICTTG 1680 

ITTGAGT AGCTGCATGC TGTTCCCCCA GGTGTIGTAA 1740 

CACAACTTTA TTGATTGAAT TTTTAAGCTA CTTATTCATA GTTTTATATC CCCCTAAACT 1800 

ACCTTTTTGT TCCCCATTCC TTAATTGTAT TGTTTTCCCA AGTGTAATTA TCATGCGTTT 1860 

TATATCTTCC TAATAAGGTG TGGTCTGTTT GTCTGAACAA AGTGCTAGAC TTTCTGGAGT 192 0 

GATAATCTGG TGACAAATAT TCTCTCTGTA GCTGTAAGCA AGTCACTTAA TCTTTCTACC 1980 

A TTGAGATAAT GATACTTAAC CAGTTAGAAG AGGTAGTGTG 2 040 

:tcattc TTTGAACATG AACTATGCCT ATGTAGTGTC 2100 

TTTATTTGCT CAGCTGGCTG AGACACTGAA GAAGTCACTG AACAAAACCT ACACACGTAC 2160 

CTTCATGTGA TTCACTGCCT TCCTCTCTCT ACCAGTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCATGTGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CCACTGAACA 2280 

AAACCTACGC ACATACCTTC ATGTGGCTCA GTGCCTTCCT CTCTCTACCA GTCTATTTCC 2340 

ATTCTTTCAG CTGTGTCTGA CATGTTTGTG CTCTGTTCCA TTTTAACAAC TGCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAAGAT GAIGTATGGA AAGGGTGTTG 2460 
GCACTGGTGT CTGGAGACCT
AGCAAGGCAT TTGGCTGCTG 
CTGATCTTCC CACCTCACAG 
GTGGTTTTGT AATTTGAAAA 
CGTTTTGGTG TTGCTTTTCA 
GCCTTAACCA GTCTCTCAAG 
AAGATTCTGA GGAAGTCTTA 
A TGGGAAGAAA 



GGATTTGAGT CTTGGTGCTA TCAATCACCG TCTGTGTTTG 
TAAGCTTATT GCTTCATCTG TAAGCGGTGG TTTGTAATTC 
TGATGTTGTG GGGATCCAGT GAGATAGAAT ACATGTAAC-T 
GTGCTATACT AAGGGAAAGA A 
AATAAAAAAA T 
TCTTCTGCAG TGAGTATGGC






TGTTAGCTGG CAGCIGACGC 
CTACACAAGG AAAGTCAGCC 
ACCTGAGAAT 



ACAAAAAAAT TTTATGGCCC 
TTTGATCTTT TTATATTCTT 
TTATAATGGG AATTTGTATA 
AAAAAAAAAA 



AATCCAACAG CAAGGGAGAT 

TCTGTTCAGT 

ACCGTGTCTT ATGAGGAATT 
ATATGCTTTT GGAAGTTAAA ATTTAAATGG CTTTTGCCAC 
TGTGAGTGTA ATTCCATGTG GATATCAGTT ACCAAACATT 
AAAATGACCA ACGAAATIGT TACAATAGAA TTTATCCAAT 
CTACCACACC TGGAAACAGA CCAATAGACA TTTTGGGGTT 



3300 
3360 
3420 



MANAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYAGDNI VTAQAMYEGL WMSCVSQSTG 
QIQCKVPDSL LNLSSTLQAT RALMWGILL GVIAIFVATV GMKCMKCLED DEVQKMRMAV 
IGGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVNARYE FGOALFTGWA AASLCLLGGA 
LLCCSCPRKT TSYPTPRPYP KPAPSSGKDY V 



TTTTTTTTTT TTTTTTTTTT TTTTTCAAGG AGAGCACAAG GAACTTTATT AATGACTTTC 
TTAATGGTTA AATGCTGTTT ACCAAGTGAC CCAGAGGCAG CGTGGTTTAG TGGTTTCAAC 
AGCATGGTCC CGAGAGTCTG ACAAACCTCA GTTCAAATCC TTCTTTTGTC TTCACTTAGT 
TTTTCTTCCT G AGATT TAGT TTCTTCATCG TTAACAATGA GGATATTAAT ATGTTTCACA 
CAGTTGTTAT GAAGAATGCA TATATTAGAA TGCCTGTAGT CTCAGCTAOT CAGGAGGCTA 
AGGTGGGGAG GTCGCTCAAG CCCAGGAATT CAAAGCTGCA ATGCATTATG ATTACAGCTG 
A CTGCACTTCA GCCTGGGCAA TGTAGTAAGA TCCCATCTCT GGCTCGGAGG 
C CACGGAGTCT CGCTGATTGC TAGCACAGCA GTCTGAGATC AAACTGCA 



AATTTTCAGA 






I 



TGGGATGAGA 



GTTGAGTGTA 
ACTGGTAACC 
TTATTTCTGT 

GCCTTTGCCT 



I 

AGTTTCGTAT GGGGATGGTT TTATATAAAT TCAGGTTTTT CCCACAATAA 
TAGTCTCAGT GCTCAATAGA AGAGATTTCT AATAGAAAAG GATTCAAACT 
TTCTCTTTTA ATGTTTCACA TTCCTGTTAC AGATTTGTTC TCTTGTGACT 
TAATATGGAC AGTTCTTGAG TCCTAACATT GAGAGGTTTT CCCTTAGTGC 
TGAGTATTAA TTGGAGAAGC TTAAAGTATT © 
GGAGGTGAAA CCTCACTAGA AAAAGGGACA A' 

ATGAAAATGG TGAACTAGTG TTTCCAAGCA TATTGGAAGG 



AACCCAGGAG G 



CTGATGTTGC 
CTGAAATTAG 
TCAACCAAAC 
CTTGCGATGA 



CCTGAGTAGC 
TGTTTGTTTG 
TTGTTGCCAG 
GAGTGCTAGG 
CACCGACTCC 
CAGGGCTTGC 
AAGCAAATTG 
TCATCATATC 
CAGGAGCCTT 
AGACTGGGAT 
CTACTCTGAC 
TTCAGGCATG 



TGGGACTACA 
TTTGTTTTTG 
GCTAGTCTCA 
ATTACAGCAC 
CTGGACCCTG 



CCCATGCCTG 



AACTCCTGGC 
TTGGATTCAG 
AGAAGCTATT 



GCTACTTGTC 
TCAAGCTGTG 
AGACAAGAGA 
AAAGATTTGT 
AACAACAGCC 
CGAGTTCCCA 
CTGAATACCT 
GTTGCTTCTT CTTCTACCAG TGGGTTCTCA 

TTTAATGCAA 



CCTTCAATGC 
ATGACAGAAG 
CTCTGGCTGA 
GGGAGCAGAC 
CTGCGAGCAA 
AATCTCTGCC 



TTCAAGTGAT 
CTTCTTCATT 
GCAATGCCCC 
CAAGTGCAGA 



TAATAGAACA 
TCTCACTCCA 
TGATCTGCCT 
GCTAAGTTTG 
TG TAG AGACG 
CCTCCTGCCT 
TCCAACATGG 
TATGACAAAA 



TTGTATTGAG 
GAGCACTTGG 
CAGCACCCCA 
CATAGTTACA 
GTATGTTCTG 



GACTTTCCAA ACTGACAAGC A 
A GAACCCTCAT 



AATCTAATTA 
CCTCCCCCCT 
ACCTTTGATA 



GAACATAAGA 
CCATGGAAAA 
CTAGAAGACT 
TAGAATGGTA 
TGAATCCTCA 
CAGATTG 



1020 
1080 
1140 
1200 
1260 
1320 
1380 



MEETYTDSLD PEKLLQCPYD KNHQIRACRF PYHLIKCRKN HPDVASKLAT CPFKARHQVP 
RAEISHHISS CDDRSCIEQD WNQTRSLRQ ETLAESTWQC PPCDEDWDKD LNEQTSTPFV 
WGTTHYSDNN SPASNIVTEH KNNLASGMRV PKSLPYVLPW KHNGNAQ 
TTAAGGAAAT CCGGGCTGCT CTTCCCCATC TGGAAGTGGC TTTCCCCACA TCGGCTCGTA 60

AACTGATTAT GAAACATACG ATGTTAATTC GGAGCTGCAT TTCCCAGCIG GGCACTCTCG 12 0 

CGCGCTGGTC CCCGGGGCCT CGCCCCCCAC CCCCIGCCCT TCCCTCCCCC GTCCTGCCCC 180 

CATCCTCCAC CCCCCGCGCT GGCCACCCCG CCTCCTTGGC AGCCTCTGC-C GGCAGCGCGC 2 40 

TCCACTCGCC TCCCGTGCTC CTCTCGCCCA TGGAATTAAT TCTGGCTCCA CTTGTTGCTC 3 00 

GGCCCAGGTT GGGGAGAGGA CGGAGGGTGG CCGCAGCGGG TTCCTGAGTG AATTACCCAG 360

GAGGGACTGA GCACAGCACC AACTAGAGAG GGGTCAGGGG GTGCGGGACT CGAGCGAGCA 420 

GGAAGGAGGC AGCGCCTGGC ACCAGGGCTT TGACTCAACA GAATTGAGAC ACGTTTGTAA 480 

TCGCTGGCGT GCCCCGCGCA CAGGATCCCA GCGAAAATCA GATTTCCTGG TGAGGTTGCG 540 

TGGGTGGATT AATTTGGAAA AAGAAACTGC CTATATCTTG CCATCAAAAA ACTCACGGAG 6 00 

GAGAAGCGCA GTCAATCAAC AGTAAACTTA AGAGACCCCC GATGCTCCCC TGGTTTAACT 660

TGTATGCTTG AAAATTATCT GAGAGGGAAT AAACATCTTT TCCTTCTTCC CTCTCCAGAA 72 0 

GTCCATTGGA ATATTAAGC C CAGGAGTTGC TTTGGGGATG GCTGGAAGTG CAATGTCTTC 780 

CAAGTTCTTC CTAGTGGCTT TGGCCATATT TTTCTCCTTC GCCCAGGTTG TAATTGAAGC 840 

CAATTCTTGG TGGTCGCTAG GTATGAATAA CCCTGTTCAG ATGTCAGAAG TATATATTAT 900 

AGGAGCACAG CCTCTCTGCA GCCAACTGGC AGGACTTTCT CAAGGACAGA AGAAACTGTG 960 

CCACTTGTAT CAGGACCACA TGCAGTACAT CGGAGAAGGC GCGAAGACAG GCATCAAAGA 1020 

ATGCCAGTAT CAATTCCGAC ATCGACGGTG GAACTGCAGC ACTGTGGATA ACACCTCTGT 1080 

TTTTGGCAGG GTGATGCAGA TAGGCAGOCG CGAGACGGCC TTCACATACG CCGTGAGCGC 1140 

AGCAGGGGTG GTGAACGCCA 1GAGCCGGGC GTGCCGCGAG GGCGAGCTGT CCACCTGCGG 12 0 0 

CTGCAGCCGC GCCGCGCGCC CCAAGGACCT GCCGCGGGAC TGGCTCTGGG GCGGCTGCGG 12 60 

CG A CAACATC GACTATGGCT ACCGCTTTGC CAAGGAGTTC GTGGACGCCC GCGAGCGGGA 1320 

GCGCATCCAC GCCAAGGGCT CCTACGAGAG TGCTCGCATC CTCATGAACC TGCACAACAA 1380 

CGAGGCCGGC CGCAGGACGG TGTACAACCT GGCTGATGTG GCCTGCAAGT GCCATGGGGT 1440 

GTCCGGCTCA TGTAGCCTGA AGACATGCTG GCTGCAGCTG GCAGACTTCC GCAAGGTGGG 1500

TGATGCCC1G AAGGAGAAGT ACGACAGCGC GGCGGCCATG CGGCTCAACA GCCGGGGCAA 1560 

GTTGGTACAG GTCAACAGCC GCTTCAACTC GCCCACCACA CAAGACCTGG TCTACATCGA 162 0 

CCCCAGCCCT GACTACTGCG TGCGCAATGA GAGCACCGGC TCGCTGGGCA CGCAGGGCCG 1680 

CCTGTGCAAC AAGACGTCGG AGGGCATGGA TGGCTGCGAG CTCATGTGCT GCGGCCGTGG 1740 

GTACGACCAG TTCAAGACCG TGCAGACGGA GCGCTGCCAC TGCAAGTTCC ACTGGTGCTG 1800 

CTACGTCAAG TGCAAGAAGT GCACGGAGAT CGTGGACCAG TTTGTGTGCA AGTAGTGGGT 1860 

GCCACCCAGC ACTCAGCCCC GCTCCCAGGA CCCGCTTATT TATAGAAAGT ACAGTGATTC 192 0 

TGGTTTTIGG TTTTTAGAAA TATTTTTTAT TTTTCCCCAA GAATTGCAAC CGGAACCATT 1980 

TTTTTTCCTG TTACCATCTA AGAACTCTGT GGTTTATTAT TAATATTATA ATTATTATTT 2040 

GGCAATAATG GGGGTGGGAA CCACGAAAAA TATTTATTTT GTGGATCTTT GAAAAGGTAA 2100 

TACAAGACTT CTTTTGGATA GTATAGAATG AAGGGGGAAA TAACACATAC CCTAACTTAG 2160 

CTGTGTGGGA CATGGTACAC ATCCAGAAGG TAAAGAAATA CATTTTCTTT TTCTCAAATA 2220 

TGCCATCATA TGGGATGGGT AGGTTCCAGT TGAAAGAGGG TGGTAGAAAT CTAITCACAA 22 8 0 

TTCAGCTTCT ATGACCAAAA TGAGTTGTAA ATTCTCTGGT GCAAGATAAA AGGTCTTGGG 23 40 

AAAACAAAAC AAAACAAAAC AAACCTCCCT TCCCCAGCAG GGCTGCTAGC TTGCTTTCTG 24 0 0 

CATTTTCAAA ATGATAATTT ACAATGGAAG GACAAGAATG TCATATTCTC AAGGAAAAAA 2460 

GGTATATCAC ATGTCTCATT CTCCTCAAAT ATTCCATTTG CAGACAGACC GTCATATTCT 2 52 0 

AATAGCTCAT GAAATTTGGG CAGCAGGGAG GAAAGTCCCC AGAAATTAAA AAATTTAAAA 258 0 

CTCTTATGTC AAGATGTTGA TTTGAAGCTG TTATAAGAAT TGGGATTCCA GATTTGTAAA 2640 

AAGACCCCCA ATGATTCTGG ACACTAGATT TTTTGTTTGG GGAGGTTGGC TTGAACATAA 2700 

ATGAAATATC CTGTATTTTC TTAGGGATAC TTGGTTAGTA AATTATAATA GTAGAAATAA 2 7 60 

TACATGAATC CCATTCACAG GTTTCTCAGC CCAAGCAACA AGGTAATTGC GTGCCATTCA 2 82 0 

GCACTGCACC AGAGCAGACA ACCTATTTGA GGAAAAACAG TGAAATCCAC CTTCCTCTTC 2 880 

ACACTGAGCC CTCTCTGATT CCTCCGTGTT GTGATGTGAT GCTGGCCACG TTTCCAAACG 2 940 

GCAGCTCCAC TGGGTCCCCT TTGGTTGTAG GACAGGAAAT GAAACATTAG GAGCTCTGCT 3000 

TGGAAAACAG TTCACTACTT AGGGATTTTT GTTTCCTAAA ACTTTTATTT TGAGGAGCAG 3060

TAGTTTTCTA TGTTTTAATG ACAGAACTTG GCTAATGGAA TTCACAGAGG TGTTGCAGCG 3120 

TATCACTGTT ATGATCCTGT GTTTAGATTA TCCACTCATG CTTCTCCTAT TGTACTGCAG 3180 

GTGTACCTTA AAACTGTTCC CAGTGTACTT GAACAGTTGC ATTTATAAGG GGGGAAATGT 3240 

GGTTTAATGG TGCCIGATAT CTCAAAGTCT TTTGTACATA ACATATATAT ATATATACAT 3 3 00 

ATATATAAAT ATAAATATAA ATATATCTCA TTGCAGCCAG TGATTTAGAT TTACAGCTTA 3360

CTCTGGGGTT ATCTCTCTGT CTAGAGCATT GTTGTCCTTC ACTGCAGTCC AGTTGGGATT 3420 

ATTCCAAAAG TTTTTTGAGT CTTGAGCTTG GGCTGTGGCC CCGCTGTGAT CATACCCTGA 3480 

GCACGACGAA GCAACCTCGT TTCTGAGGAA GAAGCTTGAG TTCTGACTCA CTGAAATGCG 3540 

TGTTGGGTTG AAGATATCTT TTTTTCTTTT CTGCCTCACC CCTTTGTCTC CAACCTCCAT 3600 

TTCTGTTCAC TTTGTGGAGA GGGCATTACT TGTTCGTTAT AGACATGGAC GTTAAGAGAT 3660

ATTCAAAACT CAGAAGCATC AGCAATGTTT CTCTTTTCTT AGTTCATTCT GCAGAATGGA 372 0 

AACCCATGCC TATTAGAAAT GACAGTACTT ATTAATTGAG TCCCTAAGGA ATATTCAGCC 37 80 

CACTACATAG ATAGCTTTTT TTTTTTTTTT TTTTTTTTAA TAAGGACACC TCTTTCCAAA 3840 

CAGGCCATCA AATATGTTCT TATCTCAGAC TTACGTTGTT TTAAAAGTTT GGAAAGATAC 3 90 0 

ACATCTTTTC ATACCCCCCC TTAGGAGGTT GGGCTTTCAT ATCACCTCAG CCAACTGTGG 3 960 

CTCTTAATTT ATTGCATAAT GATATCCACA TCAGCCAACT GTGGCTCTTT AATTTATTGC 402 0 

ATAATGATAT TCACATCCCC TCAGTTGCAG TGAATTGTGA GCAAAAGATC TTGAAAGCAA 4080 

AAAGCACTAA TTAGTTTAAA ATGTCACTTT TTTGGTTTTT ATTATACAAA AACCATGAAG 4140 

TACTTTTTTT ATTTGCTAAA TCAGATTGTT CCTTTTTAGT GACTCATC-TT TATGAAGAGA 4200 

GTTGAGTTTA ACAATCCTAG CTTTTAAAAG AAACTATTTA ATGTAAAATA TTCTACATGT 4260 

CATTCAGATA TTATGTATAT CTTCTAGCCT TTATTCTGTA CTTTTAATGT ACATATTTCT 4320 
GTCTIGCGTG ATTTGTATAT TTCACTGGTT TAAAAAACAA A 
AATGGAAGAT AGAATATAAA ATAAAACGTT A 

Seq ID NO: 249 Protein sequence:
Protein Accession #: NP_0033S 
MAGSAMSSKF FLVALAIFFS FAQWIEANS WWSLGMNNPV QMSEVYIIGA QPLCSQLAGL 60

SQGQKKLCHL YQDHMQYIGE GAKTGIKECQ YQFRHRRWNC STVDKTSVFG RVMQIGSRET 120 

AFTYAVSAAG WNAMSRACR EGELSTCGCS RAARPKDLPR DWLWGGCGDN IDYGYRFAKE 180 

FVDARERERI HAKGSYESAR ILMNLHNNEA GRRTVYNIiAD VACKCHGVSG SCSLKTCWLQ 240 

LADFRKVGDA LKEKYDSAAA HRLNSRGKLV QVNSRFNSPT TQDLVYIDPS PDYCVRNEST 300 

GSLGTQGRLC NKTSEGMDGC ELMCCGRGYD QFKTVQTERC HCKFHWCOYV KCKKCTEIVD 360 

Seq ID 
Nucleic 

Coding sequence: 56..1324

i ji i 1 I 1 i 1 r 

TGACTTGGAT GTAGACCTCG ACCTTCACAG GACTCTTCAT TGCTGGTTGG CAATGATGTA 60 

TCGGCCAGAT GTGGTGAGGG CTAGGAAAAG AGTTTGTTGG GAACCCTGGG TTATCGGCCT 120 

CGTCATCTTC ATATCCCTGA TTGTCCTGGC AGTGTGCATT GGACTCACTG TTCATTATGT 180 

GAGATATAAT CAAAAGAAGA CCTACAATTA CTATAGCACA TTGTCATTTA CAACTGACAA 240 

ACTATATGCT GAGTTTGG CA GAGAGGCTTC TAACAATTTT ACAGAAATGA GCCAGAGACT 300 

TGAATCAA1G GTGAAAAATG CATTTTATAA ATCTCCATTA AGGGAAGAAT TTGTCAAGTC 360 

TCAGGTTATC AAGTTCAGTC AACAGAAGCA 1GGAGTGTTG GCTCATATGC TGTTGATTTG 420 

TAGATTTCAC TCTACTGAGG ATCCTGAAAC TGTAGATAAA ATTGTTCAAC TTGTTTTACA 480 

TGAAAAGCTG CAAGATGCTG TAGGACCCCC TAAAGTAGAT CCTCACTCAG TTAAAATTAA 540 

AAAAATCAAC AAGACAGAAA CAGACAGCTA TCTAAACCAT TGCTGCGGAA CACGAAGAAG 600 

TAAAACTCTA GGTCAGAGTC TCAGGATCGT TGGTGGGACA GAAGTAGAAG AGGGTGAATG 660

GCCCTGGCAG GCTAGCCTGC AGTGGGATGG GAGTCATCGC TGTGGAGCAA CCTTAATTAA 720 

TGCCACATGG CTTGTGAGTG CTGCTCACTG TTTTACAACA TATAAGAACC CTGCCAGATG 780 

GACTGCTTCC TTTGGAGTAA CAATAAAACC TTCGAAAATG AAACGGGGTC TCCGGAGAAT 840 

AATTGTCCAT GAAAAATACA AACACCCATC ACATGACTAT GATATTTCTC TTGCAGAGCT 90 0 

TTCTAGCCCT GTTCCCTACA CAAATGCAGT ACATAGAGTT TGTCTCCCTG ATGCATCCTA 960 

TGAGTTTCAA CCAGGTGATG TGATGTTTGT GACAGGATTT GGAGCACTGA AAAATGATGG 1020 

TTACAGTCAA AATCATCTTC GACAAGCACA GGTGACTCTC ATAGACGCTA CAACTTGCAA 1080 

TGAACCTCAA GCTTACAATG ACGCCATAAC TCCTAGAATG TTATGTGCTG GCTCCTTAGA 1140 

AGGAAAAACA GATGCATGCC AGGGTGACTC TGGAGGACCA CTGGTTAGTT CAGATGCTAG 1200 

AGATATCTGG TACCTTGCTG GAATAGTGAG CTGGGGAGAT GAATGTGCGA AACCCAACAA 1260 

GCCTGGTGTT TATACTAGAG TTACGGCCTT GCGGGACTGG ATTACTTCAA AAACTGGTAT 132 0 

CTAAGAGAGA AAAGCCTCAT GGAACAGATA ACATTTTTTT TTGTTTTTTG GGTGTGGAGG 1380 

CCATTTTTAG AGATACAGAA TTGGAGAAGA CTTGCAAAAC AGCTAGATTT GACTGATCTC 1440 
AATAAACTGT TTGCTTGATG CAAAAAAAAA A 

Seq ID NO: 251 Protein sequence: 

i i 1 r r i 1 i 1 

MYRPDVVRAR KRVCWEPWVI GLVIFISLIV LAVCIGLTVE YVRYNOKKTY NYYSTLSFTT 60 

DKLYAEFGRE ASNNFTEMSQ RLESMVKNAF YKSPLREEFV KSQVIKFSQQ KKGVLAHMLL 12 0 

ICRFHSTEDP ETVDKIVQLV LIIEKLQDAVG PPKVDPHSVK IKKINKTETD SYLNHCCGTR 180 

RSKTLGQSLR IVGGTEVEEG EWPWQASLQW DGSHRCGATL INATWLVSAA I-ICFTTYKNPA 24 0 

RWTASFGVTI KPSKMKRGLR RIIVHEKYKH PSHDYDISLA ELSSPVPYTN AVHRVCLPDA 300 

SYEFQPGDVM FVTGFGALKN DGYSQNHLRQ AQVTLIDATT CNEPQAYNDA ITPRMLCAGS 36 0 

LEGKTDACQG DSGGPLVSSD ARDIWYLAGI VSWGDECAKP NKPGVYTRVT ALRDWITSKT 420 
GI 
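
The "Coding sequence: start..end" annotation above can be cross-checked against the length of the translated protein. The following is a minimal sketch, assuming the range is 1-based and inclusive and that it includes the terminal stop codon (which appears to hold for the 56..1324 range and the 422-residue protein of Seq ID NO: 251). The function name `cds_protein_length` is hypothetical and is not part of the listing.

```python
import re


def cds_protein_length(annotation: str) -> int:
    """Return the number of amino acids implied by a 'start..end' CDS range.

    Assumes a 1-based, inclusive range whose last codon is a stop codon.
    """
    # Tolerate stray spaces around the dots, as seen in scanned listings.
    match = re.search(r"(\d+)\s*\.\.\s*(\d+)", annotation)
    if match is None:
        raise ValueError(f"no coordinate range found in: {annotation!r}")
    start, end = int(match.group(1)), int(match.group(2))
    span = end - start + 1  # inclusive nucleotide count
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


print(cds_protein_length("Coding sequence: 56..1324"))  # 422
```

A mismatch between this count and the residue count of the listed protein would suggest a transcription error in either the coordinates or the sequence.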



1 11 21 31 41 51 

I I I I I I 

GGCACGAGGC CTCGTGCCGC CGGGCTCTTG GTACCTCAGC GCGAGCGCCA GGCGTCCGGC 60 

CGCCGTGGCT ATGTTCGTGT CCGATTTCCG CAAAGAGTTC TACGAGGTGG TCCAGAGCCA 12 0 

GAGGGTCCTT CTCTTCGTGG CCTCGGACGT GGATGCTCTG TGTGCGTGCA AGATCCTTCA 180 

GGCCTTGTTC CAGTGTGACC ACGTGCAATA TACGCTGGTT CCAGTTTCTG GGTGGCAAGA 240 

ACTTGAAACT GCATTTCTTG AGCATAAAGA ACAGTTTCAT TAITTTATTC TCATAAACTG 300 

TGGAGCTAAT GTAGACCTAT TGGATAITCT TCAACCTGAT GAAGACACTA TATTCTTTGT 360 

GTGTGACACC CATAGGCCAG TCAATGTCGT CAATGTATAC AACGATACCC AGATCAAATT 42 0 

ACTCATTAAA CAAGATGATG ACCTTGAAGT TCCCGCCTAT GAAGACATCT TCAGGGATGA 480 

AGAGGAGGAT GAAGAGCATT CAGGAAATGA CAGTGATGGG TCAGAGCCTT CTGAGAAGCG 54 0 

CACACGGTTA GAAGAGGAGA TAGTGGAGCA AACCATGCGG AGGAGGCAGC GGCGAGAGTG 600 

GGAGGCCCGG AGAAGAGACA TCCTCTTTGA CTACGAGCAG TATGAAIATC ATGGGACATC 660 

GTCAGCCATG GTGATGTTTG AGCTGGCTTG GATGCTGTCC AAGGACCTGA ATGACATGCT 72 0 

GTGGTGGGCC ATCGTTGGAC TAACAGACCA GTGGGTGCAA GACAAGATCA CTCAAATGAA 780 

ATACGTGACT GATGTTGGTG TCCTGCAGCG CCACGTTTCC CGCCACAACC ACCGGAACGA 840 

GGATGAGGAG AACACACTCT CCGTGGACTG CACACGGATC TCCTTTGAGT ATGACCTCCG 900 

CCTGGTGCTC TACCAGCACT GGTCCCTCCA TGACAGCCTG TGCAACACCA C-CTAIACCGC 960 

AGCCAGGTTC AAGCTGTGGT CTGTGCATGG ACAGAAGCGG CTCCAGGAGT TCCTTGCAGA 102 0 

CATGGGTCTT CCCC1GAAGC AGGTGAAGCA GAAGTTCCAG GCCATGGACA TCTCCTTGAA 1080 

GGAGAATTTG CGGGAAATGA TTGAAGAGTC TGCAAATAAA TTTGGGATGA AGGACATGCG 114 0 

CGTGCAGACT TTCAGCATTC ATTTTGGGTT CAAGCACAAG TTTCTGGCCA GCGACGTGGT 12 00 

CTTTGCCACC ATGTCTTTGA TGGAGAGCCC CGAGAAGGAT GGCTCAGGGA CAGATCACTT 12 6 0 

CATCCAGGCT CTGGACAGCC TCTCCAGGAG TAACCTGGAC AAGCTGTACC ATGGCCTGGA 132 0 

ACTCGCCAAG AAGCAGCTGC GAGCCACCCA GCAGACCATT GCCAGCTGCC TTTGCACCAA 1380 

CCTCGTCATC TCCCAGGGGC CTTTCCTGTA CTGCTCTCTC ATC-GAGC-GCA CTCCAGATGT 1440 

CATGCTGTTC TCTAGGCCGG CATCCCTAAG CCTGCTCAGC AAACACCTGC TCAAGTCCTT 150 0 

TGTGTGTTCG ACAAAGAACC GGCGCTGCAA ACTGCTGCCC CTGGTGATGG CTGCCCCCCT 1560 
GAGCATGGAG CATGGCACAG TGACCGTGGT GGGCATCCCC CCAGAGACCG ACAGCTCGGA 1620

CAGGAAGAAC TTTTTTGGGA GGGCGTTTGA GAAGGCAGCG GAAAGCACCA GCTCCCGGAT 1680 

GCTGCACAAC CATTTTGACC TCTCAGTAAT TGAGCTGAAA GCTGAGGATC GGAGCAAGTT 1740 

TCTGGACGCA CTTATTTCCC TCCTGTCCTA GGAATTTGAT TCTTCCAGAA TGACCTTCTT 1800 

ATTTATGTAA CTGGCTTTCA TTTAGATTGT AAGTTATGGA CATGATTTGA GATGTAGAAG 1860

CCATTTTTTA 1TAAATAAAA TGCTTATTTT AGGCTCCGTC CCCAAAAAAA AAAAAAAAAA 192 0 



Seq ID NO: 253 Protein sequence:
Protein Accession #: NP_003495.1

1 11 21 31 41 51 

I I I I I 

MFVSDFRKEF YEWQSQRVL LFVASDVDAL CACKILQALF QCDHVQYTLV PVSGWQELET 60 

AFLEHKEQFH YFILINCGAN VDLLDILQPD EDTIFFVCDT HRPVNWNVY NDTQIKLLIK 120

QDDDLEVPAY EDIFRDEEED EEHSGNDSDG SEPSEKRTRL EEEIVSQTMR RRQRREWEAR 180 

RRDILFDYEQ YEYHGTSSAM VMFELAWMLS KDLNDMLWWA IVGLTDQWVQ DKITQMKYVT 240 

DVGVLQRHVS RHNHRNEDEE NTLSVDCTRI SFEYDLRLVL YQHWSLEDSL CNTSYTAARF 3 00 

KLWSVHGQKR LQEFLADMGL PLKQVKQKFQ AMDISLKENL REMIEESANK FGMKDMRVQT 360 

FSIHFGFKHK FLASDWFAT MSLMESPEKD GSGTDHFIQA LDSLSRSNLD KLYHGLSLAK 420

KQLRATQQTI ASCLCTNLVI SQGPFLYCSL MEGTPDVMLF SRPASLSLL3 KHLLKSFVCS 480 

TKNRRCKLLP LVMAAPLSME HGTVTWGIP PETDSSDRKN FFGRAFEKAA ESTSSRMLHN 540 
HFDLSVIELK AEDRSKFLDA LISLLS 

Seq ID NO: 254 DNA sequence
Nucleic Acid Accession #: NM_022337
Coding sequence: 48..683



GGCTGCGCTT CCCTGGTCAG GCACGGCACG TCTGGCCGGC CGCCAGGATG CAGGCCCCGC 60 

ACAAGGAGCA CCTGTACAAG TTGCTGGTGA TTGGCGACCT GGGCGTGGGG AAGACCAGTA 12 0 

TCATCAAGCG CTACGTGCAC CAGAACTTCT CCTCGCACTA CCGGGCCACA ATCGGCGTGG 180 

ACTTCGCGCT CAAGGTGCTC CACTGGGACC CGGAGACTGT GGTGCGCCTG CAGCTCTGGG 240 

ATATCGCAGG TCAAGAAAGA TTTGGAAACA TGACGAGGGT CTATTACCGA GAAGCTATGG 300 

GTGCATTTAT TGTCTTCGAT GTCACCAGGC CAGCCACATT IGAAGCAGTG GCAAAGTGGA 3 60 

AAAATGATTT GGACTCCAAG TTAAGTCTCC CTAATGGCAA ACCGGTTTCA GTGGTTTTGT 420 

TGGCCAACAA ATGTGACCAG GGGAAGGATG TGCTCATGAA CAATGGCCTC AAGATGGACC 480 

AGTTCTGCAA GGAGCACGGT TTCGTAGGAT GGTTTGAAAC ATCAGCAAAG GAAAATATAA 540 

ACATTGATGA AGCCTCCAGA TGCCTGGTGA AACACATACT TGCAAATGAG TGTGACCTAA 600 

TGGAGTCTAT TGAGCCGGAC GTCGTGAAGC CCCATCTCAC ATCAACCAAG GTTGCCAGCT 660

GCTCTGGCTG TGCCAAATCC TAGTAGGCAC CTTTGCTGGT GTCTGGTAGG AATGACCTCA 72 0 

TTGTTCCACA AATTGTGCCT CTATITTTAC CATTTTGGGT AAACGTCAGG ATAGATATAC 7 80 

CACATGTGGC AAGCCAAAGA TCTATGCCTC TGTTTTTTCA ATGAGAGAGA AATAGCAAAT 840

GTTCTTTCTA TGCTTTCCTC ACCATCATCA CAGTGTTTAC AAACTTTTGA AAATATTTAG 900

TCTGTTACAA ACTTCTGTCA IGTAGCTGAC CAAAATCCTG CAGGGCCACA GTCGGCACTG 960

TTATTTGCTT CTTTTAATCA GCAAAGGCCT CAAGTCTTAA AATAAAAGGG GAGAAGAACA 1020

AACTAGCTGT CAAGTCAAGG ACTGGCTTTC ACCTTGCCCT GGTGTCTTTT TCCAGATTTC 1080

AATATATTCT CTGATGGCCT GACAGGCCTA TTAAGTAGAT GTGATATTTT CTTCCAAGAT 1140 

GACCTCCATT CTCGGCAGAC CTAAGAGTTG CCTCTGAGTT AGCTCTTTGG AATCGTGAAC 1200 

ACAGGTGTGC TATATTGTCC TTGTCCTAAC TGTCACTTGC CATGGCCTGA ATGTTGGCTT 12 60 

AACTGAATAT TGTATGAAAA GACATGCCTC CATATGTGCC TTTCTGTTAG CTCTCTTTGA 1320 

CTCAAGCTGT GGGGCTCCTC TATACATGCT ATACATGTAA TATATATTAT ATATATTTTT 1380 
GCAAGTGAAC AATAAAACAT TAAAAGATAA AA 



MQAPHKEHLY KLLVIGDLGV GKTSIIKRYV HQNFSSHYRA TIGVDFALKV LHKDPETWR 
LQLWDIAGQE RFGNMTRVYY REAMGAFIVF DVTRPATFEA VAKWKNDLDS KLSLPNGKPV 
SWLLANKCD QGKDVLMNNG LKMDQFCKEH GFVGWFETSA KENINIDEAS RCLVKHILAN
ECDLMESIEP DWKPHLTST KVASCSGCAK S 

Seq ID NO: 256 DNA sequence 
Nucleic Acid Accession #: NM_016321 
Coding sequence: 25..1464

1 11 21 31 41 51 

I I I I I 

GGAACCGCCC GCIGCCAGCC CGGCCAGGCA CCCCTGCAGC ATGGCCTGGA ACACCAACCT 

CCGCTGGCGG CTGCCGCTCA CCTGCCTGCT CCTGCAGGTG ATTATGGTGA TTCTCTTCGG
GGTGTTCGTG CGCTACGACT TCGAGGCCGA CGCCCACTGG TGGTCAGAGA GGACGCACAA 
GAACTTGAGC GACATGGAGA ACGAATTCTA CTATCGCTAC CCAAGCTTCC AGGACGTGCA 
CGTGATGGTC TTCGTGGGCT TCGGCTTCCT CATGACTTTC CTGCAGCGCT ACC-GCTTCAG 
CGCCGTGGGC TTCAACTTCC TGTTGGCAGC CTTCGGCATC CAGTGGGCGC TGCTCATGCA 

GGGCTGGTTC CACTTCTTAC AAGACCGCTA CATCGTCGTG GGCGTGGAGA ACCTCATCAA
CGCTGACTTC TGCGTGGCCT CTGTCTGCGT GGCCTTTGGG GCAGTTCTGG GIAAAGTCAG 
CCCCATTCAG CTGCTCATCA TGACTTTCTT CCAAGTGACC CTCTTCGCTG TGAATGAGTT 
CATTCTCCTT AACCTGCTAA AGGTGAAGGA TGCAGGAGGC TCCATGACCA TCCACACATI 
TGGCGCCTAC TTTGGGCTCA CAGTGACCCG GATCCTCTAC CGACGCAACC TAGAGCAGAG 

CAAGGAGAGA CAGAATTCTG TGTACCAGTC GGACCTCTTT GCCATGATTG GCACCCTCTT
ATCCAGTGCC CTGCACAAGA AGGGCAAGCT GGACATGGTG CACATCCAGA ATGCCACGCT 900

CGCAGGAGGG GTGGCCGTGG GTACCGCTGC TGAGATGATG CTCATGCCTT ACGGTGCCCT 960

CATCATCGGC TTCGTCTGCG GCATCATCTC CACCCTGGGT TTTGTATACC TGACCCCATT 102 0 

CCTGGAGTCC CGGCTGCACA TCCAGGACAC ATGTGGCATT AACAATCTGC ATGGCATTCC 1080 

A GGCGGCATCG TGGGTGCTGT GACAGCGGCC TCCGCCAGCC TTGAAGTCTA 1140 

A GGGCTTGTCC ATTCCTTTGA CTTTCAAGGT TTCAACGGGG ACTGGACCGC 12 00 

„ G GGAAAGTTCC AGATTTATGG TCTCTTGGTG ACCCTGGCCA TGGCCCTGAT 1260 

GGGTGGCATC ATTGTGGGGC TCATTT1GAG ATTACCATTC TGGGGACAAC CTTCAGATGA 132 0 

GAACTGCTTT GAGGATGCGG TCTACTGGGA GATGCCTGAA GGGAACAGCA CTGTCTACAT 1380 

CCCTGAGGAC CCCACCTTCA AGCCCTCAGG ACCCTCAGIA CCCTCAC-TAC CCATGGTGTC 1440 

CCCACTACCC ATGGCTTCCT CGGTACCCTT GGTACCCTAG GCTCCCAGGG CAGGTGAGGA 1500

A CAGACTSTCC TGGGGCCCAG AGGAGCTGGT GCTGACCTAG CTAGGGATGC 1560 

C AAGCAGCACC CCCACC1GCT GGCTTGGCCT CAAGGTGCCT CCACCCCTGC 1620 

CCTCCCCTTC ATCCCAGGGG GTCTGMCTGA GAATGGAGAA GGAGAAGCTA CAAAGTGGGC 1680 

ATCCAAGCCG GGTTCTGGCT GCAGAAGTTC TGCCTCTGCC TGGGGTCTTG GCCACATTGG 1740 

AGAAAAACAG GCTCAAAGTG GGGCTGGGAC CTGGTGGGTG AACCTGAGCT CTCCCAGGAG 1800 

ACAACTTAGC TGCCAGTCAC CACCTATGAG GCTCTTCTAC CCCGTGCCTG CACCTCGGCC 1860 

AGCATCTCCT ATGCTCCCTG GGTCCCCCAG ACCTCTCTGT GTTGTGTGCG TGGCAGCCTC 1920 
CAGGAATAAA CATTCTTGTT GTCCTTTGTA AAAAAAAAAA AAAAAAAA 

Seq ID NO: 257 Protein sequence: 
Protein Accession #: NP_057405

1 11 21 31 41 51 

MAWNTNLRWR LPLTCLLLQV IMVILFGVPV RYDFEADAHW WSERTHKNLS DMENEFYYRY 60 

PSFQDVHVMV FVGFGFLMTF LQRYGFSAVG FKFLLAAFGI QWALLMQGWF HFLQDRYIW 12 0 

GVENLINADF CVASVCVAFG AVLGKVSPIQ LLIMTFFQVT LFAVNEFILL NLLKVKDAGG 180 

SHTIHTFGAY FGLTVTRILY RRNLEQSKER QNSVYQSDLF AMIGTLFLWM YWPSFKSAIS 240 

YHGDSQHRAA INTYCSLAAC VLTSVAISSA LHKKGKLDHV HIQKATLAGG VAVGTAAEMM 300 

3 FVCGIISTLG FVYLTPFLES RLHIQDTCGI NNLHGIPGII GG I VGAVTAA 360 

E GLVHSFDFQG FNGDWTARTQ GKFQIYGLLV TLAMALMGGI IVGLILRLPF 420 
F EDAVYWEMPE GNSTVYIPED PTFKPSGPSV PSVPMVSPLP MASSVPLVP 

Seq ID NO: 258 DNA sequence 



1 11 21 31 41 51 

| | I I I I 

GGGAAGTGCT GTTGGAGCCG CTGTGGTTGC TGTCCGCGGA GTGGAAGCGC GTGCTTTTGT 60

TTGTGTCCCT GGCCATGGCG CTGCAGCTCT CCCGGGAGCA GGGAATCACC CTGCGCGGGA 120 

GCGCCGAAAT CGTGGCCGAG TTCTTCTCAT TCGGCATCAA CAGCATTTTA TATCAGCGTG 180 

GCATATATCC ATCTGAAACC TTTACTCGAG TGCAGAAATA CGGACTCACC TTGCTTGTAA 240 

CTACTGATCT TGAGCTCATA AAATACCTAA ATAATGTGGT GGAACAACTG AAAGATTGGT 300 

TATACAAGTG TTCAGTTCAG AAACTGGTTG TAGTTATCTC AAATATTGAA AGTGGTGAGG 360

TCCTGGAAAG ATGGCAGTTT GATATTGAGT GTGACAAGAC TGCAAAAGAT GACAGTGCAC 420 

CCAGAGAAAA GTCTCAGAAA GCTATCCAGG ATGAAATCCG TTCAGTGATC AGACAGATCA 4 80 

A CGGT GAi ATTTrTG CCACTGTTGG AAGTTTCTTG TTCATTTGAT CTGCTGATTT 540 

ATACAGACAA AGATTTGGTT GTACCTGAAA AATGGGAAGA GTCGGGACCA CAGTTTATTA 600 

CCAATTCTGA GGAAGTCCGC CTTCGTTCAT TTACTACTAC AATCCACAAA GTAAATAGCA 660 

TGGTGGCCTA CAAAATTCCT GTCAATGACT GAGGATGACA TGAGGAAAAT AATGTAATTG 720 

TAATTTTGAA ATGTGGTTTT CCTGAAATCA GGTCATCTAT AGTTGATATG TTTTATTTCA 780 

TTGGTTAATT TTTACATGGA GAAAACCAAA ATGATACTTA CTGAACTGTG TGTAATTGTT 840 

CCTTTATTTT TTTGGTACCT ATTTGACTTA CCATGGAGTT AACATCATGA ATTTATTGCA 900 

CATTGTTCAA AAGGAACCAG GAGGTTTTTT TGTCAACATT GTGATGTATA TTCCTTTGAA 960 

GATAGTAACT GTAGATGGAA AAACTTGTGC TATAAAGCTA GATGCTTTCC TAAATCAGAT 1020 

GTTTTGGTCA AGTAGTTTGA CTCAGTATAG GTAGGGAGAT ATTTAAGTAT AAAATACAAC 1080 

AAAGGAAGTC TAAATATTCA GAATCTTTGT TAAGGTCCTG AAAGTAACTC ATAATCTATA 1140 

AACAATGAAA TATTGCTGTA TAGCTCCTTT TGACCTTCAT TTCATGTATA GTTTTCCCTA 1200 

TTGAATCAGT TTCCAATTAT TTGACTTTAA TTTATGTAAC TTGAACCTAT GAAGCAATGG 1260 

ATATTTGTAC TGTTTAATGT TCTGTGATAC AGAACTCTTA AAAATGTTTT TTCATGTGTT 1320 

TTATAAAATC AAGTTTTAAG TGAAAGTGAG GAAATAAAGT TAAGTTTGTT TTAAAAAAAA 1380 
AAAAAAAAAA 



MALQLSREQG ITLRGSAEIV AEFFSFGINS ILYQRGIYPS ETFIRVQKYG LTLLVTTDLE 

LIKYLNNWE QLKDWLYKCS VQKLVWISN IESGEVLERW QFDIECDKTA KDDSAPREKS 

QKAIQDEIRS VIRQITATVT FLPLLEVSCS FDLLIYTDKD LWPEKWEES GPQFITNSSE 
VRLRSFTTTI 



I I I I I I 

AAAGGCCTGC AGCAGGACGA GGACCTGAGC CAGGAATGCA GGATGGCGGC GGTGAAGAAG 
GAAGGGGG1G CTCTGAGTGA AGCCATGTCC CTGGAGGGAG ATGAATGGGA ACTGAGTAAA 
GAAAATGTAC AACCTTTAAG GCAAGGGCGG ATCATGTCCA CGCTTCAGGG AGCACTGGCA 
CAAGAATCTG CCTGTAACAA TACTCTTCAG CAGCAGAAAC GGGCATTTGA ATATOAAATT 
CGATTTTACA CTGGAAATGA CCCTCTGGAT GTTTGGGATA GGTATATCAG CTGGACAGAG
C CTCAAGGTGG GAAAGAGAGT AATATGTCAA CGTTATTAGA AAGAGCTGTA 

AAATTAGGGC GTTTATGCAA TGAGCCTTTG GATATGTACA GTTACTTC-CA CXACCAAGG3 
ATTGGTGTTT CACTTGCTCA GTTCTATATC TCATGGGCAG AAGAATATGA AGCTAC-AGAA 
AACTTTAGGA AAGCAGATGC GATATTTCAG GAAGGGATTC AACAGAAC-GC TC-AACCACTA 
: AGTCCCAGCA CCGACAATTC CAAGCTCGAG TGTCTCGC-CA AACTCTGTTG 
A GGAGGAAGTT TTTGAGTCTT CTGTACCACA ACGAAGCACA 






GGTGCTCTCA 
TTGTCTAAGC 



3 CCAGAACAGA 



ACAGCTTCAC 



CAGGCCCTTG 



AGCACCAGAA 
CAAGCGTCTG 
GTAGGGGAAT 
CAAAGGGAAG 
GAAGAGATGG 
CAGCAAGAAG 



GAAACTTCAC 
CCTTTCTCCA 
GATCCCCCAC 
ATCACCTCAA 
TTGAGCGAGG 
GACACTTGTG 



AGCCAGTTAT 
AGCCTGGAAA 
AGGAGAAGAA 
TCTCCTTTGA 
CCGAGCTATT 
AGAAGAAGCT 
AGACGATGCC 
CAGGAATGAC 
TTGCGGAGAA 

GAGTTTTAGC 
ATGAAGATGT 
ATGCCATTAT 
ACTTTGCCAG 
ATCTCCCTTC 
AGGACCAGCA 
GCCCAATTAT 
CCTCGGTTGC 
CTAATGAGAC 



GCCATGGATA 
GAACACAGGC 
ACCCGCTGTG 
GACACCATGT 
GGAAGAAGGA 
AGAGAAGATG 



GAAAATGCTG 



ATCCATTTCC 



AGGTCCTTGG 



TGCCCAGGGC 



TCACTCCATA 



GATCCTCTAC 



GACCAGTGCA 
AAAAGAAATC 
TACAAAGGAG 
TCTATCCAGT 



GCTGAAGTTT 



AGGAGAAGAT TTATGCAGGA 



ATCAACCTCA 
CAATATCAAG 
CTTCTCCAAC 
TTGACAATAG 
TGTCTGATTC 
TTGAAGATAG 
CTCAGCGGCT 
TCTCCCTACC 



TGCCTAAGTT 
GAGAATACCT 
AATTAACAGT 
AGTTAAAGGA 
ATGGCIGTAT 
ACAGTGAATA 
TGGAGATGCT 
TCAGAAACAG 
TGGACTTTTC 
TTCGGACTGT 



GTTTCTTCTT 
TCAACGAAGA 
GTCTCCAGAT 
CACAGGCTTC 
AGCAGCTCGT 
TGATCCTGAG 
GACAGCTTGT 
TGAAGACAGT 
AAGCACCTCC 
TTCAGAAAAC 
CCTACCAGAG 
GGAAATTGAG 
AATATGTGAA 
AATAAAGGTA 



CAAACTACTC 
ACAACTAAAC 
TCTGTTTGTC 
GAACAACCTC 
TCAGAAAAGA 

ccccrrGCAG 



TGCAAATTGC 



AACAGGTGAT 



AGAAATGTAA 
TTTGTATCCA 
AGACTGTTAC 
GGCACTATCT 
CGTGAAGCCA 
TCCATCAAAT 
CCTACTCAGT 



TGTTTGGCAC 



AAGGAAATTG 
GATTACAAGT 
TCTTCTCAAC 
GAAGATTTTG 



ATTCTAAAGG 
AGAATAAAAG 
TTCTCAAAAC 
AATTTACAGG 
CAATTTGTCC 
CTCCTTTTCA 
CGGAAGAAGA 
ACAGTCAGAC 
CACACTCCTC 
GTCTTCAAAT 
CACCATGGTG 
CTGCAGAGTT 
AATTAGGTAA 
TATTCTGGGT 



TTGTGCCAGA 
TCCCAGTGTA 
TCCTCCTGCA 
CTCAGAAAGC 
AATTGAACCC 
TAACCCAGAA 



GGACTTTTAT 



CTAAAAGATG 



GTATTGTGGA 
CTACCATTGC 
ACAGTGATAT 
CAGACTCATT 
CTTTTCCCAT 



GTGAATTGTG 
CTGTTCTTGG 
ACCTGAACAA 
AGTGAGCTAG 
ACACTGAAAC 
TGTTCTACTT 



ACACAAAGCA 
AATCCACGAT 
CTACAGTGTT 
ACAGATCCTG 
GTTTGGTATA 
GGATGGGTCC 
GAATAAATTC 
GGAGCTTGCA 
AGCCTTATGG 
GCAATCAAGT 
TGTATGTGCT 
TTTGGTACAG 



GAAATAACAG 
GAAATAGTCC 
CCCTATGATT 
GACCTTAGGG 



TTCTGGAAAC 



ACTGCTTCAC 
TGITGATTAT 

ATGGTGACTT 
GTAACAAGAA 
TGCAGCTGGA 
AGATCCTGGC 
CACATTTACT 
TTAGCCAAAA 



GCAGAAATGA A 



TCTCAGCATC 
TGGCTTCTCT 
TCCTGAGAAA 
TTCACAGTAT 
GTGTATAGAA 
TGAGGATTAC 



CAATCAAGCT 
TGTTTTTACC 
TAACTGTTCT 
ATTGTTCAAG 
TATTTCTGAG 



2820 
2880 
2940 
3000



GTAATTTAAT TTAGGACACA TTTAGATGCA 
GTATATTTTG ACGTCACTGA TATTTTTTAT 
AACTTTTGTG AAGAACTATT TTATTCTAAA 
TTAACCCATT TGTCTCTACT TTTCCCTGTA 
ACCATGTATT TTGTAAATAA 



I 

MAAVKKEGGA 
AFEYEIRPYT 
FLNLWLKLGR 
QKAEPLERLQ 
PIIRVGGALK 
PRAKENELQA 



GNDPLDVWDR 
LCNEPLDMYS 
SQHRQFQARV 
APSQNRGLQN 
GPWNIGRSLE 



YLHNQGIGVS 
SRQTLLALEK 
PFPQQMQNNS 
HRPRGNTASL 
RVQSHQQASE 
EMQKQIEEHE 



PLRQGRIMST 
QGGKESNMST 
LAQFYISWAE 

RITVFDENAD 



LLERAVEALQ 
EYEARSNFRK 
VPQRSTLAEL 
EASTASLSK? 
TPYVESTAQQ 



GEKRYYSDPR 
ADAIFQEGIQ 
KS KGKKTARA 
TVQPWIAPPM 
PVMTPCKIEP 360



ICPNPEDTCD 
SQTLSIKKLS 
PWCSQYRRQL 



PIIEDSREAT H 



CFTLQDLLQH 
NKNNQALKIV 
HLLLFKEHLQ 
GVFDTTFQSH 



LIIYNLLTIV 
QLDVFTLSGF 
SQNISELKDG 
LTSPGALLFQ 



EKKEKMMYCK EKIYAGVGEF SFESIRASVF 420 
KKLKEIQTTQ QERTGDQQES 
SKC-PSVPFSI 
FTGIEPLSED 
EEDLDVKTSS 
LQIPEKLELT 
LGNEDYCIKR 
HFCSCYQYQD 
GDLSPRCLIL 
ILANCSSPYQ 

LNANDSATV3 VLGELAAEMN 1020 



LKERLNEDFD 
EHLHKAEIVK 
RTVQILEGQK 
ELWNKFFVRI 



FDEFLLSSKK 
AI ITGFRNVT 
DQQTACGTIY 
NET8ENPTQS 
EYLICEDYKL 
GCIVWHQYIN 
RNRIHDPYDC 



Seq ID NO: 262 DNA sequence
Nucleic Acid Accession #: NM_003784
Coding sequence: 365..1507






WO 02/086443 PCT/US02/12476 
I I I I I 

- - GC AGAGTGCAGG CTGCACCTTT GGACAGCCTT 60 

T TTTAGAACAA ATTTTTGTCT AGAAATGCTG ACTTTGGTTC 12 0 

CG AAGCTCTCCT TCATCACCTT CCTAAGTGCA 180 

TGTACAGGGA AGCTCTCCTT CATCACCTTC CTAAGTGCAT GC-GGGAAAAT ACCTAGGGCT 240 

CAACAGTCTT GAGAAGTGTG GAAACATTTT CTTTGTGAGT GAGAACAGAT CACCTAGAGA 300 

AAGGAAACCA GATTCCCATC ACTGCTTCTG GGTATCAGAT GCTAGCGCTG CACTCCATTT 360 

TGCAATGGCC TCCCTTGCTG CAGCAAATGC AGAGTTTTGC TTCAACCTGT TCAC-AGAGAT 42 0 

GGATGACAAT CAAGGAAATG GAAATGTGTT CTTTTCCTCT CTGAGCCTCT TCGCTGCCCT 480 

GGCCC1GGTC CGCTTGGGCG CTCAAGATGA CTCCCTCTCT CAGATTGATA AGTTGCTTCA 540 

TGTTAACACT GCCTCAGGAT ATGGAAACTC TTCTAATAGT CAGTCAGGGC TCCAGTCTCA 600 

ACTGAAAAGA GTTTTTTCTG ATATAAATGC ATCCCACAAG GATTATGATC TCAGCATTGT 660 

GAATGGGCTT TTTGCTGAAA AAGTGTATGG CTTTCATAAG GACTACATTG AGTGTGCCGA 72 0 

AAAATTATAC GATGCCAAAG TGGAGCGAGT TGACTTTACG AATCATTTAG AAGACACTAG 780 

ACGTAATATT AATAAGTGGG TTGAAAATGA AACACATGGC AAAATCAAGA ACGTGATTGG 840 
TGAAGGTGGC ATAAGCTCAT CTGCTGTAAT GGTGCTGGTG AATGCTGTGT A 
CAAGTGGCAA TCAGCCTTCA CCAAGAGCGA AACCATAAAT TGCCATTTCA A 

GTGCTCTGGG AAGGCAGTCG CCATGATGCA TCAGGAACGG AAGTTCAATT TGTCTGTTAT 102 0 

TGAGGACCCA TCAATGAAGA TTCTTGAGCT CAGATACAAT GGTGGCATAA ACATGTACGT 1080 

TCTGCTGCCT GAGAATGACC TCTCTGAAAT TGAAAACAAA CTGACCTTTC AGAATCTAAT 1140 

GGAATGGACC AATCCAAGGC GAATGACCTC TAAGTATGTT GAGGTATTTT TTCCTCAGTT 1200

CAAGATAGAG AAGAATTATG AAATGAAACA ATATTTGAGA GCCCTAGGGC TGAAAGATAT 1260 

CTTTGATGAA TCCAAAGCAG ATCTCTCTGG GATTGCTTCG GGGGGTCGTC TGTATATATC 1320 

AAGGATGATG CACAAATCTT ACATAGAGGT CACTGAGGAG GGCACCGAGG CTACTGCTGC 1380 

CACAGGAAGT AATATTGTAG AAAAGCAACT CCCTCAGTCC ACGCTGTTTA GAGCTGACCA 1440 

CCCATTCCTA TTTGTTATCA GGAAGGATGA CATCATCTTA TTCAGTGGCA AAGTTTCTTG 1500 

CCCTTGAAAA TCCAATTGGT TTCTGTTATA GCAGTCCCCA CAACATCAAA GRACCACCAC 1560

AAGTCAATAG ATYTCRGTTT AATTGGAAAA ATGTGGTGTT TCCTTTGAGT TTATTTCTTC 1620

CTAACATTGG TCAGCAGATG ACACTGGTGA CTTGACCCTT CCTAGACACC TGGTTGATTG 1680 

TCCTGATCCC TGCTCTTAGC ATTCTACCAC CATGTGTCTC ACCCATTTCT AATTTCATTG 1740 

TCTTTCTTCC CACGCTCATT TCTATCATTC TCCCCCATGA CCCGTCTGGA AATTATGGAG 1800 

RGTGCTCAAC TGGTAAGGAG AACGTAGAAG TAGCCCTAGG GATCCTTTTT GAAACTCTAC 1860 

AGTTATCGCA GATATTCTAG CTTCATTGTA AGCAATCTAG GAAATAAGCC CTGCTGCTTT 1920

CTAGAAATAA GTGTGAAGGA TAAATTTTCT TTGTTGACCT ATGAAGATTT TAGAGTTTAC 1980 

CTTCATATGT TTGATTTTAA ATCAGTGTAT AATCTAGATG GTAAAAAATG TGAAATTGGG 2040 

ATTAGGGACC TACCAAAATA TTTCATTAAT GCTTTCAATT GACAAATTTT GGCCTTTCTT 2100 

TGATAAGACA ATATGTACAT GTTTTTTCAA ATATTAAAGA TCTTTTAACT GTTGGCAGTT 2160 

GTTATCTACA GAATCATATT TCATATGCTG TGTAGTTTAT AAGTTTTTCC TCTATTTATC 2220 
AGAATAAAGA AATACAACAT ACCTGTAAA 

Seq ID NO: 263 Protein sequence: 



I I I I 

MASLAAANAE FCFNLFREMD DNQGNGNVFF SSLSLFAALA LVRLGAQDDS LSQIDKLLHV 60

NTASGYGNSS NSQSGLQSQL KRVFSDINAS HKDYDLSIVN GLFAEKVYGF HKDYIECAEK 120

LYDAKVERVD FTNHLEDTRR NINKWVENET HGKIKNVIGE GGISSSAVMV LVNAVYFKGK 180

WQSAFTKSET INCHFKSPKC SGKAVAMMHQ ERKFNLSVIE DPSMKILELR YNGGINMYVL 240

LPENDLSEIE NKLTFQNLME WTNPRRMTSK YVEVFFPQFK IEKNYEMKQY LRALGLKDIF 300 

DESKADLSGI ASGGRLYISR MMHKSYIEVT EEGTEATAAT GSNIVEKQLP QSTLFRADHP 360 
FLFVIRKDDI ILFSGKVSCP 

Seq ID NO: 264 DNA sequence
Nucleic Acid Accession #: AB0S2906 
Coding sequence: 74-814 

1 11 21 31 41 51 

I I I I I I 

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAGCG 60 

CTCTGGGTCC TTAATGGCAG CAGCCGCCGC TACCAAGATC CTTCTGTGCC TCCCGCTTCT 120 

GCTCCTGCTG TCCGGCTGGT CCCGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGA 180

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTCACAC CTGTCAGTCC 300

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420

GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540 

AATGTGGACA ACGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720 

CTCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAG 1080 

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140

GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ATAATCACGT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCCG TGTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

h ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320 
A GAGTTCTATT TCCCAAAAAA AAAAAAAAAA A 

Seq ID NO: 265 Protein sequence: 






MAAAAATKIL LCLPLLLLLS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT
FLHYDCGNKT VTPVSPLGKK LNVTTAWKAQ NPVLREVVDI LTEQLRDIQL ENYTPKEPLT
LQARMSCEQK AEGHSSGSWQ FSFDGQIFLL FDSEKEMWTT VHPGARKMKE KWENDKVVAM
SFHYFSMGDC IGWLEDFLMG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCCLLIILPC



Seq ID NO: 266 DNA sequence

Nucleic Acid Accession #: XH_0848S3.1

Coding sequence: 127-444



I 



I 



ATTGATGATA TATTTAACGA AATCAAATTT GGTGAATATG TGGACACTGG AAAGCTAATC 
GACAAGATCA ACTTACCAGA TTTCCTAAAA GTGTACCTTA ACCACAAGCC ACCTTTTGGT 
AACACCATGA GTGGCATCCA CAAGAGCTTT GAGGTGCTCG GTTATACCAA CTCCAAAGGG 
AAAAAGGCCA TTCGAAGAGA GGACTTCCTG AGACTGCTCG TTACTAAAGG TGAGCATATG 
ACGGAGGAGG AGATGTTGGA TTGCTTTGCT TCACTGTTTG GCCTGAATCC CGAGGGATGG 
AAATCCGAGC CTGCAACCTG CTCCGTCAAA GGTTCAGAAA TTTGCCTTGA AGAAGAACTT 
CCAGACGAAA TCACTGCAGA AATATTCGCG ACTGAAATTC TTGGCTTAAC CATTTCAGAA 
GATTCCGGCC AGGATGGTCA GTGAAGTTAC CAGGAATGTT TAAAGCACAA AGGACTTTGG 
GTGTGTGTGC ATGCACATGT GTGTGTTTTC CATGAGGCAC TGCTTTTTAT GCATTTCCCT 
CCCCCCTCTC ATCTTTAGAA CATTTAGACA TTAAAGCAAG TTTCTGGTGA GCAATG 



iRREDFLRL LVTK 



EITAEIFATE ILGLTISEDS GQDGQ 



Seq ID NO: 268 DNA sequence
Nucleic Acid Accession
Coding sequence: 57-482 



I 



CTCCTCTCCT 



I 



I 



m " T rCTGAG GAGAC rt l^G 

CCCAGTATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 
GCCCCAAGGA GGAGGATAGG ATAATCCCGG GTGGCATCTA T. 
AGTGGGTACA GCGTGCCCTT CACTTCGCCA T
ACTACTACAG ACGTCCGCTG CGGGTACTAA G 
ATTACTTCTT CGACGTAGAG GTGGGCCGCA CCATATGTAC C 
ACACCTGTGC CTTCCATGAA CAGCCAGAAC T< 
TCTACGAAGT TCCCTGGGAG AACAGAAGGT C 
AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 
CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 
GACAGACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 
CTTCCTTCTT GCTTCTAATA GCCCTGGTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 



CCCAACTTGG 



MAQYLSTLLL LLATLAVALA WSPKEEDRII PGGIYNADLN DEWVQRALHF A

DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LDTCAFHEQP ELQKKQLCSF 
EIYEVPWENR RSLVKSRCQE S 

Seq ID NO: 270 DNA sequence 
Nucleic Acid Accession #: XMJ393210
Coding sequence: 13-1854 



I 



I 



41 



ATGGCAAGCG CCGGAATCTC CTCAGCTGCC GTTTCACAAA AGAGGTACCA GGTCCGCACC
AAACGAGCAC ACAAGCAGCA CCAGGAGCTG CAGAAGAAGG AGGCGGCAGC GATGGACCAG 
GGCAGAGGGA ATGGGGAGGG GGCATCCTAC CCCATATCTG AGGTGCGACT GCGGGACGTA 
GAGCGGACTG GGCCTTTCCC GTTGGCGCGT GGCCTCAATC AGGACTTCTT GCCCACGTGC 
GCCTTCAAAA CGGTAAGAGC TGCAACTGAA CGTGTGAGAC ATGGTGCAGA TAGGCTGAGA 
GGCGGCGGGA GAGATGCCCA TGAACTCAAG TACCCGGACA CGCCCTCCAC TTCTACCACC 
ACGAGTAACA CCGCCCCCAC GGGACCGCTC TCGAGGTCCC CCAAGCCAAG GACGCAAGGA 
GGAACGCCCC GGCGCGCGGC CAGCAGCGGC GGGCACCGGC CCAATGGCCA CGGAACTCAG
CACTGGCAGT CGGCCCTCCT CACACCGCAG GCGTGCAGTG TGGCCGACGG AGCCTCCCGG 
GCCGAGGACC CAGCTAGGCC GTCACCCCGG TTGCTCCCAC GGGAAGGGGC ACCAGGCAAA 
CTGCCCAAGG CCCCGAGCCC AGGCTCCCTG GCGGAGGCCT CCGCTGGTCC CGCCCAGATC 
ATGGCCGCCA CCAGGCTCCC GAGCCATGGC TTCCTGTCCG GGAACGGCCC GGCGTCCTGG 
CTGTCCAGCT AG 



Seq ID NO: 271 Protein sequence:




I I I I I I 

MLRHGEQKRK RAHKKWDPLP TCAFKTVRAA TERVRHGADR LRGGGRDAHE LKYPDTFSTS 60 

TTTSNTAPTG PLSRSPKPRT QGGTPRRRPA AAGTRANGHG TQHWQSALLT PQACSVADGA 120 

SRAEDPARPS PRLLPREGAP GKLPKAPSPG SLAEASAGLL AHVRLQNADA QRVSISQALP 180 

PNSSVGRKEE RPGAGQQRRA PAPMATELST GSRPSSHRRR AVWPTEPPGP RTQLEPSPRL 240

LPREGAPGKL PKAPSPGSLA EASAGPAQIH AATRLPSRGP LSGNGPASWL SS 



3 GAGAGAGAAC AGGGAGGGTA G 
TGAAAAAGCT TTTTTTCCCA CTTTTAACTT GCTTTAGCGT TAAGAGTACT TACCAGCTAA 
G G AAATT ATT C TTTCTCATTG GAGATTACAG AATATATCTA T' 



ATGATTTTGT CTTGTTTCTG CAGTGAGAAA TTACATCCAT AGCAAAGACA AAAGTCTTTT 
T TTATTTATCT TTCATATAGT TCTTACAATT TCTAAAAAAT TAACACTCAT 
3 AGAGGGTTTT TTGTAT 



MGGRENREGR DAFEKAFFPT FNLL 



Coding sequence: 299-961 

1 11 21 31 41 51 

CTCTGAGCTT CTCTGAGCCT TGTTTGCTCA TCTGGAAAAA GGGGATTAAA CCATTTACCT 60
CATGGAGTTG TGAAAGAATA GCTGCAAAGC ACCTAACACA TAGTAAGGTT CCCAGTGCAG 120
CTACTTCTGC TGGGTTGAGT CTAGCTGTGT AGGCCCCTTG TTCCTCACCT GGAGAAACTG 180
GGGTGGCAGG CCGGTCCCCC ACAAAAGATA ACTCATCTCT TAATTTGCAA GCTGCCTCAA 240
CAGGAGGGTG GGGGAACAGC TCAACAATGG CTGATGGGCG CTCCTGGTGT TGATAGAGAT 300
GGAACTTGGA CTTGGAGGCC TCTCCACGCT GTCCCACTGC CCCTGGCCTA GGCGGCAGCC 360
TGCCCTGTGG CCCACCCTGG CCGCTCTGGC TCTGCTGAGC AGCGTCGCAG AGGCCTCCCT 420
GGGCTCCGCG CCCCGCAGCC CTGCCCCCCG CGAAGGCCCC CCGCCTGTCC TGGCGTCCCC 480

CGCCGGCCAC CTGCCGGGGG GACGCACGGC CCGCTGGTGC AGTGGAAGAG CCCGGCGGCC 540
GCCGCCGCAG CCTTCTCGGC CCGCGCCCCC GCCGCCTGCA CCCCCATCTG CTCTTCCCCG 600
CGGGGGCCGC GCGGCGCGGG CTGGGGGCCC GGGCAGCCGC GCTCGGGCAG CGGGGGCGCG 660
GGGCTGCCGC CTGCGCTCGC AGCTGGTGCC GGTGCGCGCG CTCGGCCTGG GCCACCGCTC 720
CGACGAGCTG GTGCGTTTCC GCTTCTGCAG CGGCTCCTGC CGCCGCGCGC GCTCTCCACA 780
CGACCTCAGC CTGGCCAGCC TACTGGGCGC CGGGGCCCTG CGACCGCCCC CGGGCTCCCG 840
GCCCGTCAGC CAGCCCTGCT GCCGACCCAC GCGCTACGAA GCGGTCTCCT TCATGGACGT 900
CAACAGCACC TGGAGAACCG TGGACCGCCT CTCCGCCACC GCCTGCGGCT GCCTGGGCTG 960

AGGGCTCGCT CCAGGGCTTT GCAGACTGGA CCCTTACCGG TGGCTCTTCC TGCCTGGGAC 1020 

CCTCCCGCAG AGTCCCACTA GCCAGCGGCC TCAGCCAGGG ACGAAGGCCT CAAAGCTGAG 1080

AGGCCCCTAC CGGTGGGTGA TGGATATCAT CCCCGAACAG GTGAAGGGAC AACTGACTAG 1140 

: TGCGGATCCC AGCCTAAAAG ACACCAGAGA CCTCAGCTAT 12 00 



CC CCCGCCCAGG CCCTGTAGGG 132 0 
ACAGCATTTG AAGGACACAT ATTGCAGTTG CTTGGTTGAA AGTGCCTGTG CTGGAACTGG 13 80 

3 CTGGCCCC 



MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG



I I I I I I 

ACTGGCCGCT GAGAGAAGAA TCGGGTGGAG CAGAGAGCAG CTGCTGCAGG GCAGACAGCC

GGACCCCCAA ATCTGCACGT ACCAGCAGTC AGCCGCCCCA CGCAGGGACC GGCTTACCCC 

TCGCTCCCCG CCCTCACTCA CTTTCTCCCG CCCTCGGCCC GGCCTCCCAG CTCTCTACTT 

CGCGTGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCTCTC CACCGCTCGA GTTCTCTACT 

CTCCATATCC GAGGGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTCGG GGGACCTAGC

CAAGCTAGGG GGGACTGGAT CCGACGGGTG GAGCAGCCAG GTGAGCCCCG AAAGGTGGGG 

3 GCGCTCCCAG CCCCACCCCG GGATCTGGTG ACGCTGGGGC TGGAATTTGA 

C GGGCAGGAGG CTGCTGAGGG ATGGAGTTGG GCCCGGCCCC 

CAGACAAGGC CCGGGGGCTC CGCCAGCAGC AGGTCCCTCG GGCCCCAGCC CTCGCTGCCA 




CCCGGGCCTG GAGCCCCACA CCCGAGGGTG CAGACTGGCT GCCAAGGCCA CACTTTTGGC 600

TAAAAGAGGC ACTGCCAGGT GTACAGTCCT GGGCATGCGC TGTTTGAGCT TCGGGGGAGA 660

GCCCAGCACT GGTCCCCGGA AAGGTGCCTA GAAGAACAAG GTGCAGGACC CCGTGCTGCC 720

TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCTG GTGTTGATAG 780

AGATGGAACT TGGACTTGGA GGCCTCTCCA CGCTGTCCCA CTGCCCCTGG CCTAGGCGGC 840

AGCCTGCCCT GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT 900

CCCTGGGCTC CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT 960

CCCCCGCCGG CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC 1020

GGCCGCCGCC GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC 1080

CCCGCGGGGG CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG 1140

CGCGGGGCTG CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC 1200

GCTCCGACGA GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC 1260 

CACACGACCT CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT 1320 

CCCGGCCCGT CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 1380 

ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG 1440

GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500 

GGACCCTCCC GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCGTCAAAGC 1560 

TGAGAGGCCC CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA 1620

CTAGCAGCCC CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680

CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG 1740

GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT 1800

AGGGACAGCA TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA 1860

CTGGCCTGTA CTCACTCATG GGAGCTGGCC CC

Seq ID NO: 277 Protein sequence:
Protein Accession #: NP_003967.1 

1 11 21 31 41 51 

I I I I I I 

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 180
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG

Seq ID NO: 278 DNA sequence 
Nucleic Acid Accession #: NM_057160.1
Coding sequence: 1-714 



60 



ATGCCCGGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60

CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGCGCA GCCTGCCCTG 120

TGGCCCACCC TGGCCGCTCT CGCTCTGCTG AGCAGCGTCG CAGAGGCCTC CCTGGGCTCC 180

GCGCCCCGCA GCCCTGCCCC CCGCGAAGGC CCCCCGCCTG TCCTGGCGTC CCCCGCCGGC 240

CACCTGCCGG GGGGACGCAC GGCCCGCTGG TGCAGTGGAA GAGCCCGGCG GCCGCCGCCG 300

CAGCCTTCTC GGCCCGCGCC CCCGCCGCCT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360

CGCGCGGCGC GGGCTGGGGG CCCGGGCAGC CGCGCTCGGG CAGCGGGGGC GCGGGGCTGC 420

CGCCTGCGCT CGCAGCTGGT GCCGGTGCGC GCGCTCGGCC TGGGCCACCG CTCCGACGAG 480

CTGGTGCGTT TCCGCTTCTG CAGCGGCTCC TGCCGCCGCG CGCGCTCTCC ACACGACCTC 540

AGCCTGGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCGGCCCGTC 600

AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660

ACCTGGAGAA CCGTGGACCG CCTCTCCGCC ACCGCCTGCG GCTGCCTGGG CTGAGGGCTC 720

GCTCCAGGGC TTTGCAGACT GGACCCTTAC CGGTGGCTCT TCCTGCCTGG GACCCTCCCG 780 

CAGAGTCCCA CTAGCCAGCG GCCTCAGCCA GGGACGAAGG CCTCAAAGCT GAGAGGCCCC 840 

TACCGGTGGG TGATGGATAT CATCCCCGAA CAGGTGAAGG GACAACTGAC TAGCAGCCCC 900 

AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAG AGACCTCAGC TATGGAGCCC 960

TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACCTGG GACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GCCCCCGCCC AGGCCCTGTA GGGACAGCAT 1080 

TTGAAGGACA CATATTGCAG TTGCTTGGTT GAAAGTGCCT GTGCTGGAAC TGGCCTGTAC 1140 
TCACTCATGG GAGCTGGCCC C 



Seq ID NO: 279 Protein sequence:



MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS 
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 
RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPHDL 
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS TWRTVDRLSA TACGCLG 



1 11 21 31 41 51 

I I I I I I 

CTGATGGGCG CTCCTGGTGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT
GTCCCACTGC CCCTGGCCTA GGCGGCAGGC TCCACTTGGT CTCTCCGCGC AGCCTGCCCT
GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC
CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT CCCCCGCCGG
CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC
GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG
CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG CGCGGGGCTG

CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC GCTCCGACGA
GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC CACACGACCT
CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT CCCGGCCCGT

CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG
CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720

CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCCC 780

GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC TGAGAGGCCC 840

CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960

CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACAGCA 1080

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140
CTCACTCATG GGAGCTGGCC CC

Seq ID NO: 281 Protein sequence:






MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE
GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG
SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG



I 



I 



CTACTGCACC TGCCCTCTGT TTCCTTTGGA AATCTCTTAC CTTTCATTAG GGTTTCTTTC
ATAGCAATTT CCTTTGGTTT TTAAGACTTC TACATTGCTT TTTCTTTTAT TATCTGTGCT
CCGTGAACCT TATGAATGCT GCTTAAAAAT AATGTCAAAA TATGTTTTAG CTGCCTACTC
AGGTAACGTT TTCTTTTGCT CTCATCTTGG TTTCCATATA CTATTTTTGG TTTTTTGTGA
GATCTAATCA ATGATCTAGT CAGAAGCTAC TTCACTGGCT AACAGTGATC ATGTTCATGT
GCTAAAAATG AACTTGAAAC ACGGAAGTAG TGGTTGGTCC AGTTTGAAAG CTCTTATTAG
TATTCTTCAT CCTGGCTGTA ATAATAGCCA TTATTTGTTA TGCCTTTGTT ATGTAGCAGA
CACTCTTAAG GATTTTATGT GTATTATTCA AATTGCTATT ACTGTTCTTT TTATAGTTGA 
GAATCTCAGG ATACCTACAT TTATCACTTT TTCAATATAT ATGTATTTCT TATT 

Seq ID NO: 283 DNA sequence
Nucleic Acid Accession
Coding sequence: 564-1481 



GAGACTTTTA 



CTTTCCCTGA 
GCAAACAAGG 
GGGGAGACAC 
AAGGGCTGGC 
CCTCCCGGCC 
GCCTGGCGCG 



ATCATCTATC 
AAAAATTCAC 
CAAAACACGC 
AGTTCCCTCC 
TCACAATTCT 
TTCAGTGGCA 
GACAGGCCCT 
CGTACTAAAC 
TAAGGGAGGC 
TCCACGCGTC 



CCTTGTGCTT TACGCAGACC C 
ATGTGTAGAC AAATTAGGTC C 
AATGACTGTC CTAAAAGTGC 
CCTCTCCTCA AAATATATCG 
GACCTCGTAA TTATATAGGG 
GGTAACATAT TTCATGTACA 
CAAAGTTGTC GGTAGGGAGC 
AAGCTTGCAA ACAGCAGGCA CCTTCCTGCC 



51 

I 

- CTAGAGGCTT 
CCAGGCAAAC 
ACACCTGTAA 
AAAGAAATCA 
GTTTCTGCGT 
CAACACCACG 
CCAGTGGCGT 
ACTGAGGAGG 



C CTGACAAAGT 540 



CCGGTTTCTC 
CTATGACGGG 
CGCTGAGCTG 



GGACGACGGT 



AGACCCAGGT 
ACCAGGTCGC 
TCAGAGGCCA 



GCTGGGGGCC 



AGGTGGTGCG CT( 



CCGCGGCTCC 
CTCGGACTCC 



GGATTGGCCG 
GCCAGTTTGA 
TCTTCCTACG 



AAGTGGCGAG 



CGTGGATGAT 



G AAAGCTCTAG 



AGGTGTGGTT 
AGGAATTAAA 
TAATTGCTGA 



AAACAGGTGC 
TATTGGCAAA GAAAAAGAAG 
TCAACAACTA GAAAAAAGAA 
AGAAAAGCAC AAGGAATGGG 



3 ATAAACCTCG 



AATTCCGTGG 



AGCCAGGAGC 
AATAACATGC 
CAGTCAATAA 
TGGAGTCCTT 



AGACCTGTGA 
AATCTTTGCC 
TTTTATCTGG 
CTCAACACTT 



AATGTGATTA 
AAGATATTTT 
TTGCAGAATA 
ACACTGTAAT 



TCCAGCTGCA 
TTCCTATCCA 
TCCCAAAGAA 
ACACAAGTCA 
GTGCAGAATA 
TTTAAAAATC 
TTGACAAATA 
CTGGATTTTG 
AGTGGCTGAA 
ACCAATAAAA 



AAGAGCTATG GTTATGCCAA 
GAACCAGCCT TTTATAATCC : 
TATCAGGAAG 
TAATTCATAA : 
CAAAGATAGC GTATG7GGAA 
AGAAATTGTT TTTTACTGC7 
GCAATTTTTG CATTTGTATA 
GTTTTTA7AA ACTTTTTAAG 

CTAACAATTT TTCTTG 



I 



I 



I 



MATRGLCWPG LAGLARAGPA GKARPRRG3A SLNLAGQKWA AGRWGPTFPS SYACFSADCR 
PRSRPSSDSC SVPMTGARGQ GLEWRSPSP PLFLSCSNST RSLL3PLGHQ SFQFDEDDGD 
GEDEEDVDDE EDVDEDAHDS EAKVASLRGM ELQGCASTQV ESENNQEEQK QVRLPE8RLT 
PWEVWFIGKE KEERDRLQLK ALEELNQQLE KRKEMEEREK RKIIAEEKHK EWVQKKNEQK 
RKEREQKINK EMEEKAAKEL EKEYLQEKAK EKYQEWLKKK NAESCERKKK EKKKNSKLKY

Seq ID NO: 285 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1-1746



I I I I I 

AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 120

GGGGCACGCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA ACGCCATGAG CCTGCAGATC 180

CTCAACACGC ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240

GCCCTGAGGA TTGAGAAGAA TGAGCTGTCG CGCATCACGC CTGGGGCCTT CCGAAACCTG 300

GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360

TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 420

CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA CGGCAACCAC 480

CTGGAATACA TCCCTGACGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG 540

GGCAAGAATA GCCTCACCCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAG 600

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660

GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACCGTC rA " < CTI rGGGAATTCC 840 

CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAG 960

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020 

ACGGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CAGACAGCTC 1140 

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200

CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCCGCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440 

GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCAGTTACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560 

ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAG CGGGCTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920

CGTGCCGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980 

GGATTTCCGA TTCATACCCC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040

ACCTGTCCTC CAAGAACAGC CTTOCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

ACTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTOG 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220

ACCCAGCATG TCCCCTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCTGGTT TTGGGGATCC TATGAAAGAG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400

AGACAGAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460

CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520

TGAAAAGTTT AGCCCTTTAA GGAATCAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580

AAAATCAGCT TATTAATACG GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700

GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGG 2880

TCCCCCCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940

TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000

AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060

GTCCGTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120 

TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTEAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240 

GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGGT ACACCGATGG GTCACACTGG 3360

GTCTGGGGGC TCCCTGGAGC TCCTCCTGCG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480

CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCCGATCGGC 3660

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCCGA 3720

TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780

CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960 

AGATGAGGCC CGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080

CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140

GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCG 4200

GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACA2 1 X XAGGGCTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320

GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440 

CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620

GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740
GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800
TGCACAGATA CTCTTCAAGC ACTGGACGTG GATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860
GGTAGGAGTG CCGCCTCTAC CCACTTGTGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920
GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTTGTACAAG AGCTCATGGC 4980
TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040
TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 
GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG T 
TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA G 

10 

AGTTGGTCGA CAGATGTTAG A 
GCCCCCAGAT CCCACAGTCA G 

GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG C 
C CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT T 
3 CCTTCAAAGC TAGATCATGT TTGCCTTGCT T 
GCCCCAGTGC TTGGCGATGC ATTTACAGAT TTCTAGGCCC T 

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCTGCTGG ATGCTGCTTG TAATCCATTT S7S0 
GGTGTACAGA ATCAACAATA AATAATATAC ATGTAT 




MPLKHYLLLL VGCQAWGAGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 60

LNTHITELNE SPFLNISALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIGL 120

FQGLDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 180 

GKNSLTHISP RVFQHLGNLQ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 240 

FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ LNRLTLFGNS LKELSLGIFG PMPNLRELWL 300 

YDNHISSLPD NVFSNLRQLQ VLILSRNQIS FISPGAFNGL TELRELSLHT NALQDLDGNV 360

FRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 420 

YDNPWRCDSD ILPLRNWLLL NQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE 480 

VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVW GMTQAQSGLA 540
IAAIVIGIVA LACSLAACVG CCCCKKRSQA VLMQMKAPNE C



Seq ID NO: 287 DNA sequence
Coding sequence: 1..954



I I I I I I 

ATGTCTTCTG AGCAGAAGAG TCAGCACTGC AAGCCTGAGG AAGGCGTTGA GGCCCAAGAA 
GAGGCCCTGG GCCTGGTGGG TGCACAGGCT CCTACTACTG AGGAGCAGGA GGCTGCTGTC 
TCCTCCTCCT CTCCTCTGGT CCCTGGCACC CTGGAGGAAG TGCCTGCTGC TGAGTCAGCA 
GGTCCTCCCC AGAGTCCTCA GGGAGCCTCT GCCTTACCCA CTACCATCAG CTTCACTTGC

TGGAGGCAAC CCAATGAGGG TTCCAGCAGC CAAGAAGAGG AGGGGCCAAG CACCTCGCCT 
GACGCAGAGT CCTTGTTCCG AGAAGCACTC AGTAACJ 
CTGCTCCCCA AGTATCGAGC CAAGGAGCTG GTCACAAAGG C 
ATCAAAAATT ACAAGCGCTG CTTTCCTGTG ATCTTCGGCA Anwvw.^wn - 
ATGATCTTTG GCATTGACGT GAAGGAAGTG GACCCCGCCA GCAACACCTA C

ACCTGCCTGG GCCTTTCCTA TGATGGCCTG CTGGGTAATA ATCAGATCTT TCCCAAGACA 
GGCCTTCTGA TAATCGTCCT GGGCACAATT GCAATGGAGG GCGACAGCGC CTCTGAGGAG 
GAAATCTGGG AGGAGCTGGG TGTGATGGGG GTGTATGATG GGAGGGAGCA CACTGTCTAT 
GGGGAGCCCA GGAAACTGCT CACCCAAGAT TGGGTGCAGG AAAACTACCT GGAGTACCGG 
CAGGTACCCG GCAGTAATCC TGCGCGCTAT GAGTTCCTGT GGGGTCCAAG GGCTCTGGCT
GAAACCAGCT ATGTGAAAGT CCTGGAGCAT GTGGTCAGGG TCAATGCAAG AGTTCGCATT 
GCCTACCCAT CCCTGCGTGA AGCAGCTTTG TTAGAGGAGG AAGAGGGAGT CTGA 

Seq ID NO: 288 Protein sequence:
Protein Accession #: NP_002353.1

I I I I I I

MSSEQKSQHC KPEEGVEAQE EALGLVGAQA PTTEEQEAAV SSSSPLVPGT LEEVPAAESA
GPPQSPQGAS ALPTTISFTC WRQPNEGSSS QEEEGPSTSP DAESLFREAL SNKVDELAHF
LLRKYRAKEL VTKAEMLERV IKNYKRCFPV IFGKASESLK MIFGIDVKEV DPASNTYTLV
TCLGLSYDGL LGNNQIFPKT GLLIIVLGTI AMEGDSASEE EIWEELGVMG VYDGREHTVY
GEPRKLLTQD WVQENYLEYR QVPGSNPARY EFLWGPRALA ETSYVKVLEH VVRVNARVRI
AYPSLREAAL LEEEEGV

Nucleic Acid Accession #: NM_002362
Coding sequence: 46..1344



I I I I I 

CGGCGGCCGC GCCCTGGTTG GGTCCCCACT GCTCTCGGGG GCGCCATGGA C 
GGCGACCTGA AGCAGGCGCT TCCCTGTGTG GCCGAGTCGC CAACGGTCCA CGTGGAGGTG 
CATCAGCGCG GCAGCAGCAC TGCAAAGAAA GAAGACATAA ACCTGAGTGT TAGAAAGCTA 
CTCAACAGAC ATAATATTGT GTTTGGTGAT TACACATGGA CTGAGTTTGA TGAACCTTTT
TTGACCAGAA ATGTGCAGTC TGTGTCTATT ATTGACACAG AATTAAAGGT TAAAGACTCA
CAGCCCATCG ATTTGAGTGC ATGCACTGTT GCACTTCACA TTTTCCAGCT GAATGAAGAT
GGCCCCAGCA GTGAAAATCT GGAGGAAGAG ACAGAAAACA TAATTGCAGC AAATCACTGG
GTTCTACCTG CAGCTGAATT CCATGGGCTT TGGGACAGCT TGGTATACGA TGTGGAAGTC 
AAATCCCATC TCCTCGATTA TGTGATGACA ACTTTACTGT TTTCAGACAA GAACGTCAAC
AGCAACCTCA TCACCTGGAA
ACATCCCTGT GTAAAGCGTT 
TATGGCCAAT TAATTGAAAT 
GGCAAGCTGG TAACCAAGAT 
CTGGTGTTCG TGCTGATTGA 
GCGGGCACCG AGCCATCAGA 
CAGATTAAAA GGCATTCCAA 
GACGTGGCCT TCGTGGACAG 
GCCATCITCA AAATCTACCT 
CCTCGCCAGC AGCTGCTGAC 
GTGTCAAAAT TGAGCCTTCT 
CGGGTCCTGA 
ACCATAGAGG 
AAGAAGCTTG CAGCTTACAT 
AACACACAAC CAGTAAGTGA 
AAACCAAACG TTACTTAGAC 
AAGTGTATTC 
TCACTGTTTG 
GTTTGTTCCC AGCCCACCCC 
AGCAAAAAAG GAAGATTAAT 
AAAGAAGCAT ATAATCATAG 
TTGCTTTCTG ATATCAGCTC 
GATGTAAAAA AGCCTATTTC 
TCTTTCATAA TAAATAATCA 
GTTCCAGGGA AACACATGCT 
TGGGATGTTT CTGCCCACGG 
TTCACATTAA TATAATATAA 
ATGTTTTGCT TTTATCTCAC 



GTTTCAGAAG 
TGAGGTGGAG 
TGCCATCCGC 



CTGCTCCACG GTCCTCCTGG CACTGGAAAA 
TTGACAATTA GACTTTCAAG CAGGTACCGA 
AGCCTCTTTT CTAAGTGGTT TTCGGAAAGT






GGCTGACATC 
CCTCCGAGAG 
CTTTCTGGCT 



AGTCTCACAG CCGCCCGAAA TGCCTGCAGG 
GTGGTCAATG CTGTCTTGAC CCAAATTGAT 
CTGACCACTT CTAACATCAC CGAGAAGATC 
AAGCAGTACA TTGGGCCACC CTCTGCAGCA 
GAAGAACTGA TGAAGTGTCA GATCATATAC 



SGC 



; TCCCAGGGAA 



CTGATCCTGG 
TGCAAGCTAG 



TTCAGATTGT TTGTCTCCTT GTGAAGAACC 
CAGTGGATGG GATGCATAAT GCCAGCAAGT 
GCAGGTGTTA TAGAAGCCAG AAGAGAAACT 
CATTAAAAAT 
GTTTGATTTA 
TACATTATAC 
AGACATGGTC 
GGACATCCCT 
TTTTGTTTGT 
AATAAATAGG 
AGTAAAATAA 



r TCCCATGGAG 



TTGTTAAAAG 
CTTGTCATTT 
ATCGAAACCT 



GTGCAAAAAT 
CAACTGAGAA 
CCATTTGCAG 
TGTAACGCGG 
GCAATAACGT 
TCAGTTACTG 
ATATAATTAA 



GTTTTCAAGA 



GAAAAGTGCA 
TATGGGCGCC 
TATCACATTT 
GTCTCTTTCT 



GGTAAAGTGT 
GACTCTGAGT 
CCTGCATTGC : 
CTAATGAGGA 
GCCGAATGTT 



I 



MDEAVGDLKQ ALPCVAESPT VHVEVHQRGS STAKKEDINL SVRKLLNRHN IVFGDYTWTE
FDEPFLTRNV QSVSIIDTEL KVKDSQPIDL SACTVALHIF QLNEDGPSSE NLEEETENII
AANHWVLPAA EFHGLWDSLV YDVEVKSHLL DYVMTTLLFS DKNVNSNLIT WNRVVLLHGP
PGTGKTSLCK ALAQKLTIRL SSRYRYGQLI EINSHSLFSK WFSESGKLVT KMFQKIQDLI
DDKDALVFVL IDEVESLTAA RNACRAGTEP SDAIRVVNAV LTQIDQIKRH SNVVILTTSN
ITEKIDVAFV DRADIKQYIG PPSAAAIFKI YLSCLEELMK CQIIYPRQQL LTLRELEMIG 
FIEHNVSKLS LLLNDISRKS EGLSGRVLRK LPFLAHALYV QAPTVTIEGF LQALSLAVDK 
QFEERKKLAA YI 

Seq ID NO: 291 DNA sequence

Nucleic Acid Accession #: NM_002658.1 

Coding sequence: 77-1372 



GTCCCCGCAG C 



GAGCGACTCC 



GAAATTCGGA 
TCACTTTTAC 
CTCTGCCACT 



GCCACCATGA 
TGTGTGTCCA 



AAAGCCCTCC 
CCGCTTTAAG 
CATCTACAGG 
CCCTTGCTGG 
CATCGTCTAC 
GGTGGAAAAC 
CATTGCCTTG 
ACAGACCATC 
CACTGGCTTT 
TGTTGTGAAG 
CACCACCAAA 
CTCAGGGGGA 
CTGGGGCCGT 
CTTACCCTGG 
AGGGAGGAAA 
TCCATCAGCT 
CACCACCAGG 
CAGACCCTCT 

GGCTCGAAGG 
AATGAATAAT 
AATGTGGGAG 



GTCCTTCAGC 
CATAATTACT 
CTAAAGCCGC 
TCTCCTCCAG 
ATTATTGGGG 
AGGCACCGGG 



I 

~ CCCTCCTGCC 
GAGCCCTGCT 
ATGAACTTCA 
ACAAGTACIT 
GTGAAATAGA 
CCAGCACTGA 
AAACGTACCA 
GCAGGAACCC 
TTGTCCAAGA 
AAGAATTAAA 
GAGAATTCAC 



GGCGCGCCTG 



I I 
GAGGCCGCCG C 
CTTCTCTGCG 

CACTGGTGCA ACTGCCCAAA 
ACCTGCTATG AGGGGAATGG 
CGGCCCTGCC TGCCCTGGAA 
TTCAGCTGGG 
AGGCGACCCT GGTGCTATGT 
CA7GACTGCG CAGATGGAAA 



CTCCAACATT C 
TAAGTCAAAA A 
CACCATGGGC G 
TGCCCACAGA T 
AGACAACCGG A 
GTGCATGGTG C. 
ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600
CACCATCGAG AACCAGCCCT GGTTTGCGGC 660 
CACCTACGTG T 



CTGAAGATCC 
TGCCTGCCCT 
GGAAAAGAGA 
CTGATTTCCC 
ATGCTATGTG 
CCCCTCGTCT 
GGATGTGCCC 



CGGGCACCAC 



GTGAACGACA 
GGCCAGGATG 
TGGACTGAAG 
GAGAGCCAGC 



GTTCCAAGGA 
CGATGTATAA 
ATTCTACCGA 
ACCGGGAGTG 
CTGCTGACCC 
GTTCCCTCCA 
TGAAGGACAA 
ACACCAAGGA 
CCGCTTTCTT 
ACTGGGAAGA 



G CTTCATTGAT T. 
A CTCCAACACG C 
A CAGCGCTGAC A 



GCCAGGCGTC 



CAGCGGTTTG 
TAAGTGTGAG 



GAGGGGTGGT 
CCTG CAGGAG 
TCCCCCGACC 
GGAAGTGTAA 
GGGAGCAGAG 



TTTGGCACAA GCTGTGAGAT 
CCGGAGCAGC TGAAAATGAC 
CACTACTACG GCTCTGAAGT 
CCAATGGAAA ACAGATTCCT GCCAGGGAGA 
ACTTTGACTG 
TACACGAGAG 
CTGGCCCTCT 
ATTTTTGCAG 
ACAGATGGAT 
AGGCCTGGGT 

CAGGGCATCT 
TGTGAGGCCC 
TCTCTTGAGG 
C7TCAGGGCA 
TTTGCACACT 



GCTGGTTGTC 
TAGGCTCTGC 
CCTCACGGAT 
CCTGACTCAA 
TTAAAAAGGG 
GGTGGGCATT 



ACACTAACGA 



ACCTGTGACC 
ATCCCTTCCT 
ACACTGAATA 



TTTATATTTC 



CAGTTTCACT 
TTCATCCAAT 
ACTATTTTTA 



TGGTCTTTCT G< 
TG CCTGGGAA T 
TTCACATAGA T 
CCTCACTGGG T 
TTTATATTTT T 



TAGAGT CATC 
TTGCCTGTGG 
GCTGGCTGCC 
ACCAGCAACT 
CCTGTGCATG 
ATGGTTGAGA 
GAGCTTAGCC 
GGGCTCTGAT 
TGTTGTGTGG 
ATATTTCCTT 
TAGGTCAC7C 
CTGCAGCA7G 
TTGGCCAGTT 
ACCACTCCTT 
AATAAAAGTG
I I I

L CVLWSDSKG SNELHQVPSN CDCLNGGTCV SNKYFSNIHW CNCPKKFGGQ 
ilGHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQLGLGKHN 

YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSdl-' PEELEFQCGQ KTLRPRFKII 

GGEFTTIENQ PWFAAIYRRH RGGSVTYVCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 

RSRLNSNTQG EMKFEVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 

PSMYNDPQFG TSCEITGFGK ENSTDYLYPE QLKMTWKLI SHRECQQPHY YGSEVTTKML 

CAADPQWKTD SCQGDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVY7 RVSHFLPWIR 



Seq ID NO: 293 DNA sequence 
Nucleic Acid Accession #: NM_001498 
Coding sequence: 93..2006



I I I I I I 

GGCACGAGGC TGAGTGTCCG TCTCGCGCCC GGAAGCGGGC GACCGCCGTC AGCCCGGAGG 60 

AGGAGGAGGA GGAGGAGGAG GAGGGGGCGG CCATGGGGCT GCTGTCCCAG GGCTCGCCGC 120

TGAGCTGGGA GGAAACCAAG CGCCATGCCG ACCACGTGCG GCGGCACGGG ATCCTCCAGT 180 

TCCTGCACAT CTACCACGCC GTCAAGGACC GGCACAAGGA CGTTCTCAAG TGGGGCGATG 240 

AGGTGGAATA CATGTTGGTA TCTTTTGATC ATGAAAATAA AAAAGTCCGG TTGGTCCTGT 300 

CTGGGGAGAA AGTTCTTGAA ACTCTGCAAG AGAAGGGGGA AAGGACAAAC CCAAACCATC 360 

CTACCCTTTG GAGACCAGAG TATGGGAGTT ACATGATTGA AGGGACACCA GGACAGCCCT 420

ACGGAGGAAC AATGTCCGAG TTCAATACAG TTGAGGCCAA CATGCGAAAA CGCCGGAAGG 480

AGGCTACTTC TATATTAGAA GAAAATCAGG CTCTTTGCAC AATAACTTCA TTTCCCAGAT 540

TAGGCTGTCC TGGGTTCACA CTGCCCGAGG TCAAACCCAA CCCAGTGGAA GGAGGAGCTT 600

CCAAGTCCCT CTTCTTTCCA GATGAAGCAA TAAACAAGCA CCCTCGCTTC AGTACCTTAA 660

CAAGAAATAT CCGACATAGG AGAGGAGAAA AGGTTGTCAT CAATGTACCA ATATTTAAGG 720

ACAAGAATAC ACCATCTCCA TTTATAGAAA CATTTACTGA GGATGATGAA GCTTCAAGGG 780

CTTCTAAGCC GGATCATATT TACATGGATG CCATGGGATT TGGAATGGGC AATTGCTGTC 840

TCCAGGTGAC ATTCCAAGCC TGCAGTATAT CTGAGGCCAG ATACCTTTAT GATCAGTTGG 900 

CTACTATCTG TCCAATTGTT ATGGCTTTGA GTGCTGCATC TCCCTTTTAC CGAGGCTATG 960 

TGTCAGACAT TGATTGTCGC TGGGGAGTGA TTTCTGCATC TGTAGATGAT AGAACTCGGG 1020

AGGAGCGAGG ACTGGAGCCA TTGAAGAACA ATAACTATAG GATCAGTAAA TCCCGATATG 1080

ACTCAATAGA CAGCTATTTA TCTAAGTGTG GTGAGAAATA TAATGACATC GACTTGACGA 1140

TACATAAAGA CATCTACGAA CAGCTGTTGC AGGAAGGCAT TGATCATCTC CTGGCCCAGC 1200

ATGTTGCTCA TCTCTTTATT AGAGACCCAC TGACACTGTT TGAAGAGAAA ATACACCTGG 1260

ATGATGCTAA TGAGTCTGAC CATTTTGAGA ATATTCAGTC CACAAATTGG CAGACAATGA 1320

GATTTAAGCC CCCTCCTCCA AACTCAGACA TTGGATGGAG AGTAGAATTT CGACCCATGG 1380

AGGTGCAATT AACAGACTTT GAGAACTCTG CCTATGTGGT GTTTGTGGTA CTGCTCACCA 1440

GAGTGATCCT TTCCTACAAA TTGGATTTTC TCATTCCACT GTCAAAGGTT GATGAGAACA 1500

TGAAGGTAGC ACAGAAAAGA GATGCTGTCT TGCAGGGAAT GTTTTATTTC AGGAAAGATA 1560

TTTGCAAAGG TGGCAATGCA GTGGTGGATG CTTGTGGCAA GGCCCAGAAC AGCACGGAGC 1620

TCGCTGCAGA GGAGTACACC CTCATGAGCA TAGACACCAT CATCAATGGG AAGGAAGGTG 1680

TGTTTCCTGG ACTGATCCCA ATTCTGAACT CTTACCTTGA AAACATGGAA GTGGATGTGG 1740

ACACCAGATG TAGTATTCTG AACTACCTAA AGCTAATTAA GAAGAGAGCA TCTGGAGAAC 1800

TAATGACAGT TGCCAGATGG ATGAGGGAGT TTATCGCAAA CCATCCTGAC TACAAGCAAG 1860

ACAGTGTCAT AACTGATGAA ATGAATTATA GCCTTATTTT GAAGTGTAAC CAAATTGCAA 1920
ATGAATTATG TGAATGCCCA GAGTTACTTG GATCAGCATT TAGGAAAGTA A 
GAAGTAAAAC TGACTCATCC AACTAGACAT TCTACAGAAA GAAAAATGCA T 
ACTGGCTACA GTACCATGCC TCTCAGCCCG TGTGTATAAT ATGAAGACCA A 

CTGTACTGTT TTCTGGGCCA GTGAGCCAGA AATTGATTAA GGCTTTCTTT GO„ ™« 

TCTAGAGTTT ATACAGTGTA CATGTACATA GTAAAGTATT TTTGATTAAC AATGTATTTT

AATAACATAT CTAAAGTCAT CATGAACTGG CTTGTACATT TTTAAATTCT TACTCTGGAG 2280

CAACCTACTG TCTAAGCAGT TTTGTAAATG TACTGGTAAT TGTACAATAC TTGCATTCCA 2340

GAGTTAAAAT GTTTACTGTA AATTTTTGTT CTTTTAAAGA CTACCTGGGA CCTGATTTAT 2400

TGAAATTTTT CTCTTTAAAA ACATTTTCTC TCGTTAATTT TCCTTTGTCA TTTCCTTTGT 2460

TGTCTACATT AAATCACTTG AATCCATTGA AAGTGCTTCA AGGGTAATCT TGGGTTTCTA 2520

GCACCTTATC TATCATGTTT CTTTTGCAAT TGGAATAATC ACTTGGTCAC CTTGCCCCAA 2580
GCTTTCCCCT CTGAATAAAT ACCCATTGAA CTCTGAAAAA AAAAAAAAAA AAAA 



Protein sequence:



I I I I I I 

MGLLSQGSPL SWEETKRHAD HVRRHGILQF LHIYHAVKDR HKDVLKWGDE VEYMLVSFDH 

ENKKVRLVLS GEKVLETLQE KGERTNPNHP TLWRPEYGSY MIEGTPGQPY GGTMSEFNTV 

EANMRKRRKE ATSILEENQA LCTITSFPRL GCPGFTLPEV KPNPVEGGAS KSLFFPDEAI 

NKHPRFSTLT RNIRHRRGEK VVINVPIFKD KNTPSPFIET FTEDDEASRA SKPDHIYMDA

MGFGMGNCCL QVTFQACSIS EARYLYDQLA TICPIVMALS AASPFYRGYV SDIDCRWGVI 

SASVDDRTRE ERGLEPLKNN NYRISKSRYD SIDSYLSKCG EKYNDIDLTI DKSIYEQLLQ 

EGIDHLLAQH VAHLFIRDPL TLFEEKIHLD DANESDHFEN IQSTNKQTMR FKPPPPNSDI 

GWRVEFRPME VQLTDFENSA YVVFVVLLTR VILSYKLDFL IPLSKVDENK KVAQKRDAVL

QGMFYFRKDI CKGGNAWDG CGKAQNSTEL AAEEYTLMSI DTIINGKEGV FPGLIPILNS 

YLENMEVDVD TRCSILNYLK LIKKRASGEL MTVARWMREF IANHPDYKQD SVITDEMNYS 
LILKCNQIAN ELCECPELLG SAFRKVKYSG SKTDSSN 
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I I I 

C TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC A 1 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300

CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360

3 CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420 

C ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540 

T TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 

A TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGCGTTGGA 660

CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720

AAGCGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780 

AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840 

CACACCCCAA ATGCATAATC TCGTTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAAATAGC 960 

CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCATTTTAT 1020

TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG TAGATATTAT 1080
TAACCCATTA GGTAAATACT ATTACAGTCG TGGTTTCTGC A 



MTDKTEKVAV DPETVFKHPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 
GHAIPPSQLD SQIDDPTGFS KDRMMQKPGS NAPVGGNVTS SPSGDDLECR ETASSPKSQR 
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FPESIIKEAA RCMRRDFVKH 



I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 

CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 

ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 

ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 

CAACAAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGCGTTGGA 

CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 

AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 

AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 

CACACCCCAA ATGCATAATC TCATTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 

TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAACTGGC 

CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA A' 

TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG T 

TAACCCATTA GGTAAATACT ATTACAGTCG TGGTTTCTGC A 

Seq ID NO: 298



MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS A 

GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQQ 

EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 
LKKKLKRMI 



I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTC^ C ATCCi J 
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 
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TCAGCAAAGA TAGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAACGTTA 540 

CCAGCAGTTT CTCTGGAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600 

AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTGGAC 660 

AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720

AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGACGA GACTTTGTTA 780 

AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840 

ACACCCCAAA TGCATAATCT CATTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTGTTT 900 



MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKAKKLMT 
GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQQ 
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH



I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC AGATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 

CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAGCTTA 420 

TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480 

TCAGCAAAGA TGGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAATGTTA 540

CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAGGAAT AGCCTCCTCT CCCAAAAGCC 600

AACAAGAAAT TAATGCTGAT ATAAAATGTC AAGTAGTGAA GGAAATCCGA TGCCTTGGAC 660

AATATGAAAA AATCTTCGAA ATGCTTGAAG GAGTGCAAGG ACCTACTGCA GTCAGGAAAC 720

GATTTTTTGA ATCCATCATC AAGGAAGCAG CAAGATGTAT GAGACGAGAC TTTGTTAAGC 780

ACCTTAAGAA GAAACTGAAA CGTATGATTT GAGAATACTT GTCCCTGGAG GATTATCACA 840

CCCCAAATGC ATAATCTCAT TAATGATTGA GGAGAGAAAA GGATCAGATT GCTGTTTTCT 900

ACAATGGAGC AGGATATTGC TGAAGTCTCC TGGCATATGT TACCGAATCA ACTGGCCTTC 960

CAGAGGCTAA GAAATTTCTG TTAGTAAAAG ATGTTCTTTT TCCCAAAGCG TTTTATTTGA 1020

AAGGATAACT TGTGTTTTGG TTATTTTGTA TTCCCACCTG TGCTGGTAGA TATTATTAAC 1080
CCATTAGGTA AATACTATTA CAGTCGTGGT TTCTGCA 



MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 

GHAIPPSQLD SQIDDFTGFS KDGMMQKPGS NAPVGGNVTS NFSGDDLECR GIASSPKSQQ 

EINADIKCQV VKEIRCLGQY EKIFEMLEGV QGPTAVRKRF F ~~ " 
KKKLKRMI 



Coding sequence: 247-815

1 11 21 31 41 51

AGTGTTCGGC TGGGACAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTTCC ATCCCCCTTG 60 

GGCCAAACAG GATCGGTGCT TCTGGTGAGA CGTCTCCCCA TGCACATCAC TCCCAGATGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300

CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420

TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480

TCAGCAAAGA TAGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAACGTTA 540 

CCAGCAGTTT CTCTGGAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600 

AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTGGAC 660 

AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720

AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGACGA GACTTTGTTA 780 

AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840 

ACACCCCAAA TGCATAATCT CGTTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTGTTT 900

TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACCGAA TCAACTGGCC 960 

TTCCAGAGGC TAAGAAATTT CTGTTAGTAA AAGATGTTCT TTTTCCCAAA GCGTTTTATT 1020

TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCAC CTGTGCTGGT AGATATTATT 1080 
AACCCATTAG GTAAATACTA TTACAGTCGT GGTTTCTGCA 
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11 21 31 41 51 

I I I I I 

I ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKAKKLMT 60



I I I I I 

CGTGGAGGCA GCTAGCGCGA GGCTGGGGAG CGCTGAGCCG CGCGTCGTGC CCTGCGCTGC 60 

CCAGACTAGC GAACAATACA GTCAQGATGG CTAAAGGTGA CCCCAAGAAA CCAAAGGGCA 120

AGATGTCCGC TTATGCCTTC TTTGTGCAGA CATGCAGAGA AGAACATAAG AAGAAAAACC 180

CAGAGGTCCC TGTCAATTTT GCGGAATTTT CCAAGAAGTG CTCTGAGAGG TGGAAGACGA 240

TGTCCGGGAA AGAGAAATCT AAATTTGATG AAATGGCAAA GGCAGATAAA GTGCGCTATG 300

ATCGGGAAAT GAAGGATTAT GGACCAGCTA AGGGAGGCAA GAAGAAGAAG GATCCTAATG 360

CTCCCAAAAG GCCACCGTCT GGATTCTTCC TGTTCTGTTC AGAATTCCGC CCCAAGATCA 420

AATCCACAAA CCCCGGCATC TCTATTGGAG ACGTGGCAAA AAAGCTGGGT GAGATGTGGA 480

ATAATTTAAA TGACAGTGAA AAGCAGCCTT ACATCACTAA GGCGGCAAAG CTGAAGGAGA 540 

AGTATGAGAA GGATGTTGCT GACTATAAGT CGAAAGGAAA GTTTGATGGT GCAAAGGGTC 600 

CTGCTAAAGT TGCCCGGAAA AAGGTGGAAG AGGAAGATGA AGAAGAGGAG GAGGAAGAAG 660 

AGGAGGAGGA GGAGGAGGAG GATGAATAAA GAAACTGTTT ATCTGTCTCC TTGTGAATAC 720

TTAGAGTAGG GGAGCGCCGT AATTGACACA TCTCTTATTT GAGAAGTGTC TGTTGCCCTC 780 

ATTAGGTTTA ATTACAAAAT TTGATCACGA TCATATTGTA GTCTCTCAAA GTGCTCTAGA 840 

AATTGTCAGT GGTTTACATG AAGTGGCCAT GGGTGTCTGG AGCACCCTGA AACTGTATCA 900 

AAGTTGTACA TATTTCCAAA CATTTTTAAA ATGAAAAGGC ACTCTCGTGT TCTCCTCACT 960 

CTGTGCACTT TGCTGTTGGT GTGACAAGGC ATTTAAAGAT GTTTCTGGCA TTTTCTTTTT 1020

ATTTGTAAGG TGGTGGTAAC TATGGTTATT GGCTAGAAAT CCTGAGTTTT CAACTGTATA 1080 

TATCTATAGT TTGTAAAAAG AACAAAACAA CCGAGACAAA CCCTTGATGC TCCTTGCTCG 1140 

GCGTTGAGGC TGTGGGGAAG ATGCCTTTTG GGAGAGGCTG TAGCTCAGGG CGTGCACTGT 1200

GAGGCTGGAC CTGTTGACTC TGCAGGGGGC ATCCATTTAG CTTCAGGTTG TCTTGTTTCT 1260

GTATATAGTG ACATAGCATT CTGCTGCCAT CTTAGCTGTG GACAAAGGQG GGTCAGCTGG 1320

CATGAGAATA TTTTTTTTTT TAAGTGCGGT AGTTTTTAAA CTGTTTGTTT TTAAACAAAC 1380

IATAGAACTC TTCATTGTCA GCAAAGCAAA GAGTCACTGC ATCAATGAAA GTTCAAGAAC 1440 

CTCCTGTACT TAAACACGAT TCGCAACGTT CTGTTATTTT TTTTGTATGT TTAGAATGCT 1500 

GAAATGTTTT TGAAGTTAAA TAAACAGTAT TACATTTTTA AAACTCTTCT CTATTATAAC 1560 

AGTCAATTTC TGACTCACAG CAGTGAACAA ACCCCCACTC CATTGTATTT GGAGACTGGC 1620

CTCCCTATAA ATGTGGTAGC TTCTTTTATT ACTCAGTGGC CAGCTCACTT AGGGCTGAGA 1680 

TGAAGGAGAG GGCTACTTGA AGCTACTGTG TGATTTTGTT TGTGTCTGAG TGGCATTCAG 1740 

ATGAAGTCTG GAGGAGTTAG GAGAACGACA TAGGCAAGGT TCAGCAGCCT TCCAAGGTAT 1800
AGGAAGGTGG GTGATTAGGA CTGAGGCTAT CTAGGTTTAA CTTTTGTCC " 

CCTATTTTGT GGGGCCAAAT GCATTGCTAA ACAGCAATTT CAGAGTGTA

AAATTAAGGC CTTATTGTTT TTCTCTTTCA CCCCTACCCC CCGTGCTCC 
ACATTATTTG TGGTGCCCAA CATTTGGGGT CTTGAGCCTG CTGCTGGTC 

AGTGAGGGTA TGTGGGATCG GGTGGTGGGG TAGGGGACGG TATCCTTTTT TTGCTCCTAC 2100 

TTGGAAACAC CAAACACCCC AAGGAAGATG ATAGGCTCCA TCTTGGGCCA CCTGAGCTAT 2160 

AGGGCAGGCT AATGGAATCA ACCATTTCTG AGCACTAAAT GTATCATGAA AAGTTGAATG 2220

GCCTGCTCAT AAGTTTAGCT CATTCACTGG AAATGTAGAT TGATGTTCAA TGTTAAACTG 2280

GAAGGAGCTT GGTTTGTGTG TCAGTGGTTA TATTAGTGGG TAGTGTAACA TTTTATCCAG 2340 

GTTGGGGTGA GGGGAGATGG CCACAGTAGC AAGTGGTGAC ACTAAATACC ATTTTGAAGG 2400 

CTGATGTGTA TATACATCAT TACTGTCCGT AGCAATGAAG GATACAGTAC TGTGTTGTGG 2460

GTGAGTGTTG CTATTGCCCA GCATTAATAT TTGGGTGTGT ATGTTTGAGG CTATGAAACA 2520

CGCAGGAGTG TTTTTGTGCT ATTAATTTTA AGAGAAAGCA GCTTTTTCTT AAAATTCACT 2580

GTTGAGAAAC TTGCATGTCT GGAGGCGGTG TCCTCTCCGC CCTGTCGGGT CCTGGATGAG 2640

TACGAGTTAT GGTCACGGTC ACAGCCTGAT CTCTTATGTG TTCATAGCCA TTCGCTCTCC 2700 

CATCAGAACT GTTTGTCCTG AATGTGTTCC TCTAGTTCTA GAAAATGACC ACTAATTTAA 2760 

AAAACTCGGT TGTGAGGTTT GCCCAGAGGC ACTTGTTCCA GAATTTCCCC TCCTGCTTCA 2820

GCCATGTCCT TGTCACTTGG CATTCTAAGC TAAAGCTTTA GCTTCCCAAT TCGTGATGTG 2880

CTAGGCCAAG ATTCGGGAGC TGTTGCCAGC CTCGTCAAAT ATGGAAGAGA AACAACCTGC 2940

GGTCAAAAGG GAGTGATTTG TTAAGTGGTG CGCGTCTATC TCATAACTAG ATGTACCAAC 3000

CAGGGAAGGG CCAAGGATGG AAAGGGGTAA CTTTTGTGCT TCCAAAGTAG CTAAGCAGAA 3060

GTGGGGGAGC AGTTTAGCCA GATGATCTTT GATTAGGCAA ACATTGAGTT TTAAAGAGGC 3120

TGTCAAGTTG AGGCCACTTG GTCCATTAGC TGGGGCAGCA AGATCACTAC TCAACGTTTT 3180 

CACACTGTGG CAAGATTGCT CTTCTAGTGG AATAATGCCC TAGTTTCTCT GAGATGATGT 3240 

AAGTGGCATG ATGTTACCTA AGGCTTAGGC TTAGCTTGAT TTCTGGGCCC ACTGTCTGTG 3300 

TTCTTAAGAT GCCAACCTGT TGCTTTTTTT TTTTTTTTCC CCCATTTAAA AGGATAGTAC 3360 

CTACTCCCTC TAACCACCTC ACCCCATTCT TGAATGACAT TTTATCCTTC GGAAAGAACA 3420

AGGCTGTGAT GTAGTGACTA TTGTCTGTGT CTCCTGTGTG TGTCTGTTCT TGTCACAAAT 3480 
GTATTTGGGG ACGTTGGATG CATTCATTTT CTGTAATAAA G 

Seq ID NO: 306 Protein sequence: 



MAKGDPKKPK GKMSAYAFFV QTCREEHKKK NPEVPVNFAE FSKKCSERWK T

DEMAKADKVR YDREMKDYGP AKGGKKKKDP NAPKRPPSGF FLFCSEFRPK IKSTNPGISI 

GDVAKKLGEM WNNLNDSEKQ PYITKAAKLK EKYEKDVADY KSKGKFDGAK GPAKVARKKV 
EEEDEEEEEE EEEEEEEEDE 
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I 



I 



I 



I 



ATGGGTACTA GGAAAAAAGT TCATGCATTT GTCCGTGTCA 
CATGAAATGA TCAGATACGG AGATGACAAA AGAAGCATTG ATATTCACTT AAAAAAAGAC 
AT T CGGAGAG GAGTTGTCAA TAACCAACAG ACAGACTGGT CGTTTAAGTT GGATGGAGT? 
TTCACGATG CCTCCCAGGA CTTGGTTTAT GAGACAGTTG CAAAGGATGT GGTTTCTCAG 
CCCTCGATG GCTATAATGG CACCATCATG TGTTA7GGGC AGACGGGAGC TGGCAAGACA 
ACACCATGA TGGGGGCAAC TGAGAATTAC AAGCACCGGG GGATCCTCCC TCGTGCCCTG 
AGCAGGTTT TTAGGATGAT CGAAGAACGC CCCACACATG CCATCACTGT GCGTGTTTCC 
ACTTGGAAA TCTATAATGA GAGCCTGTTT GATCTCCTG? CCACTCTGCC C7ATG7TGGA 
CCTCAGTCA CACCAATGAC CATCGTGGAA AACCCTCAAG GAGTCTTCAT TAAGGGCTTG 
CAGTTCACC TCACAAGTCA GGAGGAGGAT GCATTCAGCC TCCTTTT7GA GGGTGAGACC 
ACAGGATTA TAGCCTCCCA CACTATGAAC AAAAACTCTT CCAGATCACA CTGCATTTTC 
CCATCTACT TAGAGGCCCA TTCCCGGACC TTATCAGAGG AAAAGTACAT 
TTAACTTGG TGGATCTGGC AGGCTCAGAG AGGCTGGGGA AGTCTGGGTC 
TCCTGAAGG AAGCCACCTA CATCAACAAA TCGCTCTCAT TCC7GGAGCA GGCCATCATT 
CCCTTGGGG ACCAGAAGCG GGACCACATC CCCTTTCGGC AGTGCAAGCT CACCCACGCT 
TGAAGGACT CGTTAGGGGG AAACTGCAAT ATGGTCCTCG TGACAAACAT CTATGGAGAA 
CTGCCCAGT TAGAAGAAAC GCTATCTTGA CTGAGATTTG CCAGCAGGAT GAAGCTAGTC 
CCACTGAGC CTGCCATCAA TGAAAAGTAT GATGCTGAGA GAATGGTCAA GAACCTGGAG 
AGGAACTAG CACTACTCAA GCAGGAGCTG GCTATCCATG ACAGCCTGAC CAACCGCACC 
TTGTGACCT ATGACCCCAT GGATGAAATC CAGATTGCTG AGATCAACTC CCAGGTGCGG 
GGTACCTGG AGGGGACACT GGACGAGATC GACATAATCA GCCTTAGACA GATCAAGGAG 
TGTTCAACC AGTTCCGGGT GGTTCTGAGC CAACAGGAAC AGGAAGTGGA GTCCACTTTG 
GCAGGAAGT ACACCCTCAT TGACAGGAAT GACTTTGCAG CCATTTCTGC TATCCAGAAG 
CGGGGCTTG TGGATGTTGA TGGCCACCTA GTGGGTGAGC CTGAAGGACA AAACTTTGGA 
TCGGAGTCG CCCCTTTCTC TACCAAACCT GGGAAGAAAG CCAAGTCCAA GAAGACATTC 
AAGAGCCAC TCAGGCCCGA CACCCCACCC TCCAAACCAG TGGCCTTTGA GGAGTTTAAG 
ATGAGCAAG GTAGTGAGAT CAACCGAATT TTCAAAGAAA ACAAATCCAT CTTGAATGAA 
GGAGGAAAA GGGCCAGCGA GACCACACAG CACATCAATG CCATCAAGCG GGAGATTGAT 
TGACCAAGG AGGCCCTGAA TTTCCAGAAG TCACTACGGG AGAAGCAAGG CAAGTACGAA 
ACAAGGGGC TGATGATCAT CGATGAGGAA GAATTCCTGC TGATCCTCAA GCTCAAAGAC 
TCAAGAAGC AGTACCGCAG CGAGTACCAG GACCTGCGTG ACCTCAGGGC TGAGATCCAG 
ATTGCCAGC ACCTAGTGGA TCAGTGTCGC CACCGCCTGC TCATGGAATT TGACATCTGG 
ACAATGAGT CCTTTGTCAT CCCTGAGGAC ATGCAGATGG CACTGAAGCC AGGCGGCAGC 
TCCGGCCAG GCATGGTCCC TGTGAACAGG ATTGTGTCTC TGGGAGAAGA TGACCAGGAC 
AATTCAGCC AGCTGCAGCA GAGGGTGCTT CCTGAGGGCC CTGATTCCAT CTCCTTCTAC 
ATGCCAAAG TCAAGATAGA GCAGAAGCAT AATTACTTGA AAACCATGAT GGGCCTCCAG 
AGGCACATA GAAAATAG 



Seq ID N 



I 

MGTRKKVHAF 

LHDASQDLVY ETVAKDWSQ 

* PTHAITVRVS YLEIYNESLF 
D AFSLLFEGET 
E RLCKSGSEGQ 
LKDSLGGNCN MVLVTNIYGE AAQLEETLSS 



CYGQTGAGKT 



I RRGWNNQQ TDWSFKLDGV 



VFNQFRWLS QQEQEVESTL 
LGVAPFSTKP GKKAKSKKTF KEPLRPDTPP 
RRKRASETTQ HINAIKREID VTKEALNFQK 
LKKQYRSEYQ DLRDLRAEIQ YCQHLVDQCR 
IRPGMVPVNR IVSLGEDDQD KFSQLQQRVL 



SLSFLEQAII 
LRFASRMKLV 
QIAEINSQVR 
DFAAISAIQK 



SLREKQGKYE 
PEGPDSISFY 



PSVTPMTIVE NPQGVFIKGL 
TIYLEAHSRT LSEEKYITSK 
ALGDQKRDKI PFRQCKLTHA 
DAERMVKNLE 
DIISLRQIKE 
VGEPEGQNFG 
NEQGSEINRI FKENKSILNE 
NKGLMI IDEE EFLLILKLKD 
YNESFVIPED MQMALKPGGS 
NAKVKIEQKH NYLKTMMGLQ 



AAATTTCTGC 
CACATTGAAG 
TAGATTGTCA 
CATTAGTATC 



k TGCCTGCTGT CATGCTCTGT CTACCAGGG7 GAATTTCCAA 



AGAAAAAACA 
ATTATACTTA 



ACCAAAGGAA AGAGTGAAGA Ai 
AGAAAAGTGG GCCAGAGGCC CCACCTCACA CTAGGACGGC AATTGCCTCT 
TCAGGCACCA TGGGTCTTAT TTGGTGTCAT AAGAAACACC CTCAACAAAG 
TCAGCCTCCA GCTTCTCTTC TTCGGGATTC TTCTTAGGGC CTCCTTT7TC 
TCCAGTACCC TGAATTTCTT ATTCCCATCC CCCATTAAAA TCTGCTTCAA 
AGAAGGACAC ATTCACTTTA AGATCCAAAT GAATGATAAG AGCTTAAAAC 
TCAGTATTAT TTGCATTTTT ATAGAAACCA AAACCATATT TCAACAAC 



\ sequence 



Nucleic Acid Accession #: NM_018622.2



ATGGCGTGGC GAGGCTGGGC G 

GTGGGCGGCC GCAGCTGCGA GGAGCTCACT GCGGTCCTAA 



3 GCCAGGCGTG GGGTGCGTCG 



CGCAGGTTTA ACTTCTTTAT TCAACAAAAA T 



A GAAAAGCACC CAGGAAGGTT 180 



A GATCAGACCC AGGGACAAGT GGTGAAGCAT ACAAGAGAAG IGCTTTGATI 
CCTCCTGTGG AAGAAACAGT CTTTTATCCT TCTCCCTATC CTATAAGGAG ICTCATAAAA 
CCTTTATTTT TTACTGTTGG GTTTACAGGC TGTGCATTTG GATCAGCTGC TA7TTGGCAA 
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TATGAATCAC TGAAATCCAG GGTCCAGAGT TATTTTGATG GTATAAAAGC TGATTGGTTG 42 0 

GATAGCATAA GACCACAAAA AGAAGGAGAC TTCAGAAAGG AGATTAACAA GTGGTGGAAT 480

AACCTAAGTG ATGGCCAGCG GACTGTGACA GGTATTATAG CTGCAAATGT CCTTGTATTC 540 

TGTTTATGGA GAGTACCTTC TCTGCAGCGG ACAATGATCA GATATTTCAC ATCGAATCCA 600

3 TCCTTTGTTC TCCAATGTTG CTGTCAACAT TCAGTCACTT CTCCTTATTT 660



GGTCAAGAGC AGTTCATGGC AGTGTACCTA TCTGCAGGT3 T7ATT7CCAA TTTTGTCAGT 

C AGGAAGATAT GGACCATCAC T7GC-TGCATC TC-G7GCCATC 

C TCGCAGCTGT CTGCACTAAG ATCCCAGAAG GGA3GCTTGC CATTATTTTC 
CTTCCGATGT TCACGTTCAC AGCAGGGAAT GCCCTGAAAG CCATTA1 

GCAGGAATGA TCCTGGGATG GAAATTTTTT GATCATGCGG CACATCTTGG G 

TTTGGAATAT GGTATGTTAC TTACGGTCAT GAACTGATTT GGAAGAACAG GGAGCCGCTA 

T AAGGACTAAT GGCCCCAAAA AAGGAGGTGG CTCTAAGTAA 



G WGCGQAWGAS VGGRSCEELT AVLTPPQLLG RRFNFFIQQK CGFRKAPRKV 
EPRRSDPGTS GEAYKRSALI PPVEETVFYP SPYPIRSLIK PLFFTVGFTG CAFGSAAIWQ 
YESLKSRVQS YFDGIKADWL DSIRPQKEGD FRKEINKWWN NLSDGQSTVT GIIAANVLVF 
CLWRVPSLQR TMIRYFTSNP ASKVLCSPML LSTFSHFSLF HMAANMYVLW 3FSS3IVNIL 
GQEQFMAVYL SAGVISNFVS YLGKVATGRY GPSLGASGAI MTVLAAVCTK IPEGRLAIIF 
LPMFTFTAGN ALKAIIAMDT AGMILGWKFF DHAAHLGGAL FGIWYVTYGH ELIWKNREPL 
VKIWHEIRTN GPKKGGGSK 

Seq ID NO: 312 DNA sequence 
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CTCTCGGCCA CCTTTGATGA GGGGACTGGG CAGTTCTAGA CAGTCCCGAA GTTCTCAAGG 
CACAGGTCTC TTCCTGGTTT GACTGTCCTT ACCCCGGGGA GGCAGTGCAG CCAGCTGCAA 
GCCCCACAGT GAAGAACATC TGAGCTCAAA TCCAGATAAG TGACATAAGT GACCTGCTTT 
GTAAAGCCAT AGAGATGGCC TGTCCTTGGA AATTTCTGTT CAAGACCAAA TTCCACCAGT 
ATGCAATGAA TGGGGAAAAA GGCATCAACA ACAATGTGGA GAAAGCCCCC TGTGCCACCT 
CCAGTCCAGT GACACAGGAT GACCTTCAGT ATCACAACCT CAGCAAGCAG CAGAATGAGT 
CCCCCCAGCC CCTCGTGGAG ACGGGAAAGA AGTCTCCAGA ATCTCTGGTC AAGCTGGATG 
CAACCCCATT GTCCTCCCCA CGGCATGTGA GGATCAAAAA CTGGGGCAGC GGGATGACTT 
TCCAAGACAC ACTTCACCAT AAGGCCAAAG GGATTTTAAC TTGCAGGTCC AAATCTTGCC 
TCGGCTCCAT TATGACTCCC AAAAGTTTGA CCAGAGGACC CAGGGACAAG CCTACCCCTC 
CAGATGAGCT TCTACCTCAA GCTATCGAAT TTGTCAACCA A7ATTACGGC TCCCTCAAAG 
AGGCAAAAAT AGAGGAACAT CTGGCCAGGG TGGAAGCGGT AACAAAGGAG A 
CAGTAACCTA CCAACTGACG GGAGATGAGC T 

ATGCCCCACG CTGCATTGGG AGGATCCAGT GGTCCAACCT GCAGGTCTTC G 
GCTGTTCCAC TGCCCGGGAA ATGTTTGAAC ACATCTGCAG ACACGTGCGT 
ACAATGGCAA CATCAGGTCG GCCATCACCG TGTTCCCCCA GCGGAGTGAT 
ACTTCCGGGT GTGGAATGCT CAGCTCATCC GCTATGCTGG CTACCAGATG CCAGATGGCA 1020 
GCATCAGAGG GGACCCTGCC AACGTGGAAT TCSCTCAGCT GTGCATCGAC CTGGGCTGGA 1080 
AGCCCAAGTA CGGCCGCTTC GATGTGGTCC CCCTGGTCCT GCAGGCCAAT GGCCGTGACC 1140 
CTGAGCTCTT CGAAATCCCA CCTGACCTTG TGCTTGAGGT GGCCATGGAA CATCCCAAAT 1200
ACGAGTGGTT TCGGGAACTG GAGCTAAAGT GGTACGCCCT GCCTGCAGTG GCCAACATGC 1260
TGCTTGAGGT GGGCGGCCTG GAGTTCCCAG GGTGCCCCTT CAATGGCTGG TACATGGGCA 1320
CAGAGATCGG AGTCCGGGAC TTCTGTGATG TCCAGCGCTA CAACATCCTG GAGGAAGTGG 1380
GCAGGAGAAT GGGCCTGGAA ACGCACAAGC TGGCCTCGCT CTGGAAAGAC CAGGCTGTCG 1440 
TTGAGATCAA CATTGCTGTG CTCCATAGTT TCCAGAAGCA GAATGTGACC ATCATGGACC 1500 
ACCACTCGGC TGCAGAATCC TTCATGAAGT ACATGCAGAA TGAATACCGG TCCCGTGGGG 1560 
GCTGCCCGGC AGACTGGATT TGGCTGGTCC CTCCCATGTC TGGGAGCATC ACCCCCGTGT 1620 
TTCACCAGGA GATGCTGAAC TACGTCCTGT CCCCTTTCTA CTACTATCAG GTAGAGGCCT 1680 
GGAAAACCCA TGTCTGGCAG GACGAGAAGC GGAGACCCAA GAGAAGAGAG ATTCCATTGA 1740 
AAGTCTTGGT CAAAGCTGTG CTCTTTGCCT GTATGCTGAT GCGCAAGACA ATGGCGTCCC 1800 
GAGTCAGAGT CACCATCCTC TTTGCGACAG AGACAGGAAA ATCAGAGGCG CTGGCCTGGG 1860 
ACCTGGGGGC CTTATTCAGC TGTGCCTTCA ACCCCAAGGT TGTCTGCATG GATAAGTACA 1920
GGCTGAGCTG CCTGGAGGAG GAACGGCTGC TGTTGGTGGT GACCAGTACG TTTGGCAATG 1980
GAGACTGCCC TGGCAATGGA GAGAAACTGA AGAAATCGCT CTTCATGCTG AAAGAGCTCA 2040
ACAACAAATT CAGGTACGCT GTGTTTGGCC TCGGCTCCAG CATGTACCCT CGGTTCTGCG 2100
CCTTTGCTCA TGACATTGAT CAGAAGCTGT CCCACCTGGG GGCCTCTCAG CTCACCCCGA 2160
TGGGAGAAGG GGATGAGCTC AGTGGGCAGG AGGACGCCTT CCGCAGCTGG GCCGTGCAAA 2220
CCTTCAAGGC AGCCTGTGAG ACGTTTGATG TCCGAGGCAA ACAGCACATT CAGATCCCCA 2280
3 CTCCAATGTG ACCTGGGACC CGCACCACTA CAGGCTCGTG CAGGACTCAC 2340 
3AGCAAA GCCCTCAGCA GCATGCATGC C 
GGCTCAAATC TCGGCAGAAT CTACAAAGTC CGACATCCAG CCGTGCCACC A 
AACTCTCCTG TGAGGATGGC CAAGGCCTGA ACTACCTGCC GGGGGAGCAC CTTGGGGTTT 
GCCCAGGCAA CCAGCCGGCC CTGGTCCAAG GCATCCTGGA GCGAGTGGTG GATGGCCCCA 
CACCCCACCA GGCAGTGCGC CTGGAGGCCC TGGATGAGAG TGGCAGCTAC TGGGTCAGTG 
ACAAGAGGCT GCCCCCCTGC TCACTCAGCC AGGCCCTCAC C7ACT7CCTG GACATCACCA 
CACCCCCAAC CCAGCTGCTG CTCCAAAAGC TGGCCCAGGT GGCCACAGAA GAGCCTGAGA 
GACAGAGGCT GGAGGCCCTG TGCCAGCCCT C 
GCCCCACATT CCTGGAGGTG CTAGAGGAGT T 
TGCTTTCCCA GCTCCCCATT CTGAAGCCCA GGTTCTACTC CATCAGCTCC CCCCGGGATC 2940
ACACGCCCAC GGAGATCCAC CTGACTGTGG CCGTGGTCAC CTACCACACC CGAGATGGCC 3000
AGGGTCCCCT GCACCACGGC GTCTGCAGCA CATGGCTCAA CAGCCTGAAG CCCCAAGACC 3060
CAGTGCCCTG CTTTGTGCGG AATGCCAGCG GCTTCCACCT CCCCGAGGAT CCCTCCCATC 3120






WO 02/086443 PCT/US02/12476 

CTTGCATCCT CATCGGGCCT GGCACAGGCA TCGCGCCCTT CCGCAGTTTC TGGCAGCAAC 3180 

GGCTCCATGA CTCCCAGCAC AAGGGAGTGC GGGGAGGCCG CATGACCTTG GTGTTTGGGT 3240 

GCCGCCGCCC AGATGAGGAC CACATCTACC AGGAGGAGAT GCTGGAGATG GCCCAGAAGG 3300 

GGGTGCTGCA TGCGGTGCAC ACAGCCTATT CCCGCCTGCC TGGCAAGCCC AAGGTCTATG 3360 

TTCAGGACAT CCTGCGGCAG CAGCTGGCCA GCGAGGTGCT CCGTGTGCTC CACAAGGAGC 3420

CAGGCCACCT CTATGTTTGC GGGGATGTGC GCATGGCCCG GGACGTGGCC CACACCCTGA 3480 

AGCAGCTGGT GGCTGCCAAG CTGAAATTGA ATGAGGAGCA GGTCGAGGAC TATTTCTTTC 3540 

AGCTCAAGAG CCAGAAGCGC TATCACGAAG ATATCTTTGG TGCTGTATTT CCTTACGAGG 3600

CGAAGAAGGA CAGGGTGGCG GTGCAGCCCA GCAGCCTGGA GATGTCAGCG CTCTGAGGGC 3660 

CTACAGGAGG GGTTAAAGCT GCCGGCACAG AACTTAAGGA TGGAGCCAGC TCTGCATTAT 3720

CTGAGGTCAC AGGGCCTGGG GAGATGGAGG AAAGTGATAT CCCCCAGCCT CAAGTCTTAT 3780 

TTCCTCAACG TTGCTCCCCA TCAAGCCCTT TACTTGACCT CCTAACAAGT AGCACCCTGG 3840 
ATTGATCGGA GCCTC 
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MACPWKFLFK TKFHQYAMNG EKGINNNVEK APCATSSPVT QDDLQYHNLS KQQNESPQPj 60 

VETGKKSPES LVKLDATPLS SPRHVRIKNW GSGMTFQDTL HHKAKGILTC RSKSCLGSIM 120

TPKSLTRGPR DKPTPPDELL PQAIEFVNQY YGSLKEAKIE EHLARVEAVT KEIETTVTYQ 180

LTGDELIFAT KQAWRNAPRC IGRIQWSNLQ VFDARSCSTA REMFEHICRH VRYSTNNGNI 240 

RSAITVFPQR SDGKHDFRVW NAQLIRYAGY QMPDGSIRGD PANVEFTQLC IDLGWKPKYG 300 

RFDVVPLVLQ ANGRDPELFE IPPDLVLEVA MEHPKYEWFR ELELKWYALP AVANMLLEVG 360

GLEFPGCPFN GWYMGTEIGV RDFCDVQRYN ILEEVGRRMG LETHKLASLW KDQAVVEINI 420

AVLHSFQKQN VTIMDHHSAA ESFMKYMQNE YRSRGGCPAD WIWLVPPMSG SITPVFHQEM 480 

LNYVLSPFYY YQVEAWKTHV WQDEKRRPKR REIPLKVLVK AVLFACMLMR KTMASRVRVT 540

ILFATETGKS EALAWDLGAL FSCAFNPKVV CMDKYRLSCL EEERLLLVVT STFGNGDCPG 600

NGEKLKKSLF MLKELNNKFR YAVFGLGSSM YPRFCAFAHD IDQKLSHLGA SQLTPMGEGD 660 

ELSGQEDAFR SWAVQTFKAA CETFDVRGKQ HIQIPKLYTS NVTWDPHHYR LVQDSQPLDL 720 

SKALSSMHAK NVFTMRLKSR QNLQSPTSSR ATILVELSCE DGQGLNYLPG EHLGVCPGHQ 780 

PALVQGILER VVDGPTPHQA VRLEALDESG SYWVSDKRLP PCSLSQALTY FLDITTPPTQ 840

LLLQKLAQVA TEEPERQRLE ALCQPSEYSK WKFTNSPTFL EVLEEFPSLR VSAGFLLSQL 900 

PILKPRFYSI SSPRDHTPTE IHLTVAVVTY HTRDGQGPLH HGVCSTWLNS LKPQDPVPCF 960

VRNASGFHLP EDPSHPCILI GPGTGIAPFR SFWQQRLHDS QHKGVRGGRM TLVFGCRRPD 1020 

EDHIYQEEML EMAQKGVLHA VHTAYSRLPG KPKVYVQDIL RQQLASEVLR VLHKEPGHLY 1080 



11 21 31 41 

I I I 

TTTACAGATA AAACTGGTAC ACTGACAGAA AATGAGATGC 
ATGTTCAATT AATGGCATGA AATACCAAGA AATTAATGGT AGACTTGTAC 
AACACCAGAC TCTTCAGAAG GAAACTTATC TTATCTTAGT A 
CTTATCCCAT CTTACAACCA GTTCCTCTTT CAGAACCAGT C 
3ATCTCT TCTTTAAAGC AGTCAGTCTC T 
h ACTGACTGCA CTGGTGATGG TCCCTGGCAA TCCAACCTGG C 

GTTGGAGTAC TATGCATCTT CACCAGATGA AAAGGCTCTA GTAGAAGCTG CTGCAAGGAT 420 

TGGTATTGTG TTTATTGGCA ATTCTGAAGA AACTATGGAG GTTAAAACTC TTGGAAAACT 480 

GGAACGGTAC AAACTGCTTC ATATTCTGGA ATTTGATTCA GATCGTAGGA GAATGAGTGT 540 

AATTGTTCAG GCACCTTCAG GTGAGAAGTT ATTATTTGCT AAAGGAGCTG AGTCATCAAT 600 

TCTCCCTAAA TGTATAGGTG GAGAAATAGA AAAAACCAGA ATTCATGTAG ATGAATTTGC 660 

TTTGAAAGGG CTAAGAACTC TGTGTATAGC ATATAGAAAA TTTACATCAA AAGAGTATGA 720 

GGAAATAGAT AAACGCATAT TTGAAGCCAG GACTGCCTTG CAGCAGCGGG AAGAGAAATT 780 

GGCAGCTGTT TTCCAGTTCA TAGAGAAAGA CCTGATATTA CTTGGAGCCA CAGCAGTAGA 840 

AGACAGACTA CAAGATAAAG TTCGAGAAAC TATTGAAGCA TTGAGAATGG CTGGTATCAA 900 

AGTATGGGTA CTTACTGGGG ATAAACATGA AACAGCTGTT AGTGTGAGTT TATCATGTGG 960 

CCATTTTCAT AGAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAGCGAGTG 1020 

TGCTGAACAA TTGAGGCAGC TTGCCAGAAG AATTACAGAG GATCATGTGA TTCAGCATGG 1080 

GCTGGTAGTG GATGGGACCA GCCTATCTCT TGCACTCAGG GAGCATGAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTGTT CAGCTGTATT ATGCTGTCGT ATGGCTCCAC TGCAGAAAGC 1200 

AAAAGTAATA AGACTAATAA AAATATCACC TGAGAAACCT ATAACATTGG CTGTTGGTGA 1260

TGGTGCTAAT GACGTAAGCA TGATACAAGA AGCCCATGTT GGCATAGGAA TCATGGGTAA 1320



TTTTTTTTAT AAGAATGTGT GCTTTATCAC ACCCCAGTTT TTATATCAGT TCTACTGTTT 1500 

GTTTTCTCAG CAAACATTGT ATGACAGCGT GTACCTGACT TTATACAATA TTTGTTTTAC 1560 

TTCCCTACCT ATTCTGATAT ATAGTCTTTT GGAACAGCAT GTAGACCCTC ATGTGTTACA 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACCGC CTCTTAAGTA TTAAAACATT 1680 

TCTTTATTGG ACCATCCTGG GCTTCAGTCA TGCCTTTATT TTCTTTTTTG GATCCTATTT 1740

ACTAATAGGG AAAGATACAT CTCTGCTTGG AAATGGCCAG ATGTTTGGAA ACTGGACATT 1800 
TGGCACTTTG GTCTTCACAG TCATGGTTAT TACAGTCACA GTAAAGf 
TCATTTTTGG ACTTGGATCA ACCATCTCGT TACCTGGGGA TCTATTATAT T 
ATTTTCCTTG TTTTATGGAG GGATTCTCTG GCCATTTTTG GGCTCCCAGA A 

TGTGTTTATT CAGCTCCTGT CAAGTGGTTC TGCTTGGTTT GCCATAATCC TCATGGTTGT 2040

TACATGTCTA TTTCTTGATA TCATAAAGAA GGTCTTTGAC CGACACCTCC ACCCTACAAG 2100 

TACTGAAAAG GCACAGCTTA CTGAAACAAA TGCAGGTATC AAGTGCTTGG ACTCCATGTG 2160 

CTGTTTCCCG GAAGGAGAAG CAGCGTGTGC ATCTGTTGGA AGAATGCTGG AACGAGTTAT 2220 

AGGAAGATGT AGTCCAACCC ACATCAGCAG ATCATGGAGT GCATCGGATC CTTTCTATAC 2280 

CAACGACAGG AGCATCTTGA CTCTCTCCAC AATGGACTCA TCTACTTGTT AAAGGGGCAG 2340 
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TAGTACTTTG TGGGAGCCAG 
ATGGCCACAC TAGCTCTGAA 
GAGTTATAAT GGCAAACAAA 
TGAATCTGAA CATGTTAAAA 
TGTCCCTTGT GCTTATGGGA 
TTTAATATAA ATGTAGAAAA 
TTGATTATTG ACTCTTCTAT 
AGAACTCTAT TTTTTTATTA 
ATACTGAGGA ATTTTGGTCC 
TTCACAGAGC AAATTAGGAG 
ATTTATACCA ATTCCTCTAA 
CAAGGGTATA TCATATATAC 
T ATTTTTGTGA 



TTCACCTCCT 
ATTAATTTCC 
CAGAAAGCAT 
T TTGAGAAT A 
CTCCTAATGG 



TTAAATCTGC 
GAGTTATATT 
CTCAGTGACC 
AATCATTTCC 
CTGTACTGTA 
AAATCAGGAA 



TTCCTAAAAT 
AAAATCTTTG 
TAGTACAAGC 
AAGAGACATT 
CATTTCAGTC 
TCTTAGTAAA 
TTCTGTAAAT 
TAAAGCTTTT 
TGTGTTGTTA 



GTCTTTAT 



CTCTTTCTGC 
AGGGTTTCAG 
CAATACAGTG 
TGTAAAAATG 



T TATACAAATC 



ATATCAATAA 



ACACAGCCTG 
AGACCCTATA 
CCTGAAGGCT 
GGGTAATTAA 



TCAGTGTGAT 
TAGTAGTTCA 
CCCTCCCAAC 
TTTCATCTCT 
TGTTGCTGAG 
GAGTATTTTT 
TATGCTGAAA 
CATGGGAAAA 
ATTCATTAAT 
TACTGCAGTA 
TAAAGTTAGC 
TCACOGAACT 
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GCCATTATAT 
TAGTATTAGC 
GTTTGCCTTG 
GTTAATGTGA 
GCATTCTGAG 



CAGTATTTAA 



CATATAAATG 
TCAAATTGAT 
TAGATACTAT 



AATTTATCAA 



AATTCCTTTA 



CATGTTACTG 
GTAGTTCAGT 
GAAGTGATTA 
ATTAAGATAA 
TGGAGATTTA 
ATTGGACTGG 



T AAGTTATTAA 
T AGCTGGATTT 
CAACGTACAA TGTCTGCATT 
CTCAGTAGAG TACTAGGTGG 
TGCATATAAC 
GCCATCAAAT 
TGACCAACTG 
ATGTAATTTT 
AAGTAAATGG 
TTAAAGCAAA 
CAATGCAGTC 



TATTTTCAGG 
GTGTCCCAGT ACAAGGCATA 
ATCAGGTAAT GTTAGCAATA 
TAATTAACTC ATTGCACTTC 
ATTGTCATTT GTTTTTGTTT 



TATTTCATGA 
ATATTGCAAA 
AAAGAAGAGC 
TTTAGAATAA 
CAAACTACCA 
GGCTGTGGAA 



CGGTTTTACC 
TTGTGCAGCC CTAAGCTTCC 
CAGGGGAAAG AATGGTAGAG 



GCTAAATATT 
CACTAATTCA 



AAACTGAGTA C 



AAATTTGCTC 
TGCATTACAT 
ACAAAGACTC 



CTTTGGAAAG 
GCTAAATACG 
GAGGGCTGTA 
GCTGCTTTAA 
GGAGCAAAGT 



ATAGGCAGGA 



TTTTCCTGTA 
CAAAGAGTTA 
CTATTTTGAA 
TGTGAAATAA 
AATCATGGTA 
GGTTTAAAAT 



CTGTACTCAC 
CAACCACTAG 
ATTATCTTGT 
TGCAAGCTTT 
CAGTAATCAC 
TATGATGTTG 
TTATTGCTAA 
AGCCTGAAGA 
GTGACTCAGC 
ATGGGCCAGG 
TTATGGTCTG 
GCTTCCAGAC 
TCAACCCTAG 
TACTGGAAAA 
GTCAAGGCCA 
AAAAATCTCG 



CATTTGAAGT 

GATTTTAAGA 
CAGTAGTTTT 
AAGGAAAGTG 
ATAATTAACT 



TAGHSGCAAG 
ACCCTGCCTC 
GAGAACTACA 
AATTTTCTAA 
CTGAGCCACA 
GAAATACTTG 
GATTGTGAGA 
CTAGAAAATT 
TGAAGGCTGT 
TGGGGTATGA 



TAGTTAAGGA 

CTGAACTGTT 
AAAGAGTTTT 
CTAGTGCTAT 
TCCCCTTTGC 
TACCCTTATC 
CAAATCGATT 
CACCAAGTCA 
AGCTTCAGCA 
GCTACGAAGA 
CTGTCCTC7T 
CCCAGGCCCT 
CATTCTGCCC 
TTGAACTTAT 
GACAGTTAAG 
AGGAAAAGGG 
TATAAGCAGG 
TTTTGGTCCC 
AAAGTAAAAC 



GAACTTTATT 
ACTCCAAATC 
CTATTTATIT 
AT7CATCCTG 
ATATTTCTTT 
TGCCAAAACC 



GTTTCCAAAA 
GGCGTAGGCT 
CCTGCTGTCG 



ATCCTGAACA 
TACGGTTAGT 
CTGATCGCTT 
AGCCAAAAGT 
AGAATCTTCC 
TATTAATAAA 



GAAGTTATAT 



GTAAATGTAG 



AGACACTGTG 
AAAACTTTCC 



TACAGCTACT 



3120 
3180 
3240 
3300 

3420 

3540 



:tccatcctg 
tattgaaaag taatgttgtc 
caaataatga 
ttgaaaatgt 
ttcccatttc atgaatataa 
acagaaatta agactttatc 



AGTGTGATTG 
TTTTTAAAAA 
CACTCCGTTT 



TAAAACTCTT 
AAAATTCTTT 
AGAGCAAAAT 
TGCCTCGTCT 



AGTTGAGAAA 
AAAGCTCATA 
GGAGACTAAA 
ACCAGGAC7G 
GAGACTCCTA 



TGCTGTTTTA 
ATAACTCATG ATATGTTTGT 
GAAATATTGT AGCAATACTT 
GTAATCCTTT AAAAATTCTC 
CATAATTTTT TAAAGTTTAT 



MQFRECSING 
NETELIKEHD 
AAARIGIVPI 
AESSILPKCI 
REEKLAAVFQ 



MKYQEINGRL 
GKSEETMEVK 



EKLFMEVCRN 
GIMGKEGRQA 
QPYCLFSQQT 
SIKTFLYWTI 
MALETHFWTW 
ILMWTCLFL 
LERVIGRCSP 



FIEKDLILLG 
MNILELINQK 
CSAVLCCRMA 
ARNSDYAIAR 
LYDSVYLTLY 
LGFSHAFIFF 
INHLVTWGSI 
DIIKKVFDRH 
THISRSWSAS 



VPEGETPDSS 
TVQISNVQTD 
TLGKLERYKL 
VDEFALKGLR 
ATAVEDRLQD 
SDSECAEQLR 
PLQKAKVIRL 
FKFLSKLLFV 
NICFTSLPIL 
FGSYLLIGKD 
I FYFVFSLFY 
LHPTSTEKAQ 
DPFYTNDRSI 



TLCIAYRKFT 
KVRETIEALR 
QLARRITEDH 



SHLMKLSHLT 
LAPSQLEYYA 
RRMSVIVQAP 



GGILWPFLGS 
LTLSTMDSST 



VIQHGLWDG 
LAVGDGANDV 
TLVQYFFYKN 
PIT/LQNKPTL 
GW'ITFGTLVF 
QNMYFVFIQL 
LDSMCCFPEG 



IFEARTALQQ 
GDKHETAVSV 
TSLSLALREH 
SMIQEAHVGI 
VCFITPQFLY 
YRDISKNRLL 
TVMVITVTVK 
LSEGSAWFAI 
EAACASVGRM 



Seq ID NO: 316 DNA sequence 
Nucleic Acid Accession #: NM_004473
Coding sequence: 661..1791



CTCGCCAGCG GTCCGCGGGG CTGGAGACCC ACGCCGTGGA GAGGACCAGC CTCAGGTCGC 
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CCCGCCTGGG CCCGCGCCCC CACCTCGCTC CCCCCGCCTC CCCTCTCTGC CCGTGGCGCT 120 

TACCGCCACC TTGGCCTCGG GGGCAGGGCA TGGGCG3CCC CCGCCAGATC GCCCAGCGCC 180

AGTACTAACT GCCCTCGCTC TGGCCTTCGA GCCCGAAGCC TCTTCTGCGC GCACAACCTA 240 

GGCAGTAATC CTAAACTAGC CGCCACCACA CACCACCTGC AGCCACCCC7 ACCCAGGGAT 300 
CACTTCCGGA CCCCTCGACC GCCCGGCACC AGCGCGCAAC Zh A C GCGC 

AGAGTCCAGT CCCGGTCGCG AGGCCACCGC CGCTGCCCGC CTCGAGAAGC ACAACGCGGG 420 

CTGAGCCGTC GGCTAGCGGG TCACTCCCGA GCCTCTGTCT GCACCGCGCC AGCCCCAGAC 480

CACGGACGCT GAGCCTCCAG CGCGCGCCAG CCTGGGCCGC TGGGCTCTCC GGGCCAGCCC 540 
GCGACGATCC CCTGAGCTCT CCGCAGAAGG GCCGAGCGTC CGTTCCGGGG ACGCCAGGCC 600 
CGCCCCCGCC CCCGGACAGC CGCGGGGATC CAGAGCCCGG GGGTGCGGGA CGCCCGCGCC 660 
ATGACTGCCG AGAGCGGGCC GCCGCCGCCG CAGCCGGAGG TGCTGGCTAC CGTGAAGGAA 720
GAGCGCGGCG AGACGGCAGC AGGGGCCGGG GTCCCAGGGG AGGCCACGGG CCGCGGGGCG 780
GGCGGGCGGC GCCGCAAGCG CCCCCTGCAG CGCGGGAAGC CGCCCTACAG CTACATCGCG 840 
CTCATCGCCA TGGCCATCGC GCACGCGCCC GAGCGCCGCC TCACGCTGGG CGGCATCTAC 900 
AAGTTCATCA CCGAGCGCTT CCCCTTCTAC CGCGACAACC CCAAAAAGTG GCAGAACAGC 960 

ATCCGCCACA ACCTCACACT CAACGACTGC TTCCTCAAGA TCCCGCGCGA GGCCGGCCGC 1020 

CCGGGTAAGG GCAACTACTG GGCGCTCGAC CCCAACGCGG AGGACATGTT CGAGAGCGGC 1080 

AGCTTCCTGC GCCGCCGCAA GCGCTTCAAG CGCTCGGACC TCTCCACCTA CCCGGCTTAC 1140 

ATGCACGACG CGGCGGCTGC CGCAGCCGCC GCTGCCGCAG CCGCCGCCGC CGCCGCCGCC 1200 

GCCGCCATCT TCCCAGGCGC GGTGCCCGCC GCGCGCCCCC CCTACCCGGG CGCCGTCTAT 1260 

GCAGGCTACG CGCCGCCGTC GCTGGCCGCG CCGCCTCCAG TCTACTACCC CGCGGCGTCG 1320 

CCCGGCCCTT GCCGCGTCTT CGGCCTGGTT CCTGAGCGGC CGCTCAGCCC AGAGCTGGGG 1380 

CCCGCACCGT CGGGGCCCGG CGGCTCTTGC GCCTTTGCCT CCGCCGGCGC CCCCGCTACC 1440 

ACCACCGGCT ACCAGCCCGC AGGCTGCACC GGGGCCCGGC CGGCCAACCC CTCTGCCTAT 1500 

GCGGCTGCCT ACGCGGGCCC CGACGGCGCG TACCCGCAGG GCGCCGGCAG TGCGATCTTT 1560 

GCCGCTGCTG GCCGCCTGGC GGGACCCGCT TCGCCCCCAG CGGGCGGCAG CAGTGGCGGC 1620 

GTGGAGACCA CGGTGGACTT CTACGGGCGC ACGTCGCCCG GCCAGTTCGG AGCGCTGGGA 1680 

GCCTGCTACA ACCCTGGCGG GCAGCTCGGA GGGGCCAGTG CAGGCGCCTA CCATGCTCGC 1740 

CATGCTGCCG CTTATCCCGG TGGGATAGAT CGGTTCGTGT CCGCCATGTG AGCCAGCGTA 1800 

GGGACGAAAA CTCATAGACA CATCGGCTGT TCACACGTTC CCCGCAACCT GAGAACGAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACCC ACGTGGAAAA GACCGAGCAG GCCACAGAGG 1920 

CTCGGTCTCC CCGCGCACAG CGTAGGCACC CTGTGTACTC TGTAAACGGG AGGAGGTGGG 1980 

Z CAGAGCCCTT GGACTGGCAC AGGGACCCTC GATGGAGCGA AGCCCTCAAA 2040

3GCATTC TATCGGGGAG GGTCCTTGGC GGTAACCAGA GGGCAGCGTA 2100 

\GACCAG GATCCAAATT GTGGGGAATC AGTTTCAGCC TTCCATGTGC 2160 

TGCCGGAACT CGGGCCTTTT TACGCGGTTC GTCCTCTAGT GCCTTTAACT GCGTTACTAC 2220 

AATAAAAGGC TGCGGCAGCG CCTTTCTTCT TAAAGTGAGG AGGACAAATT TGCAAAAGAA 2280 

ATAGGCTTTT CTTCTTTTTT AAATTGGAGA AATCTCTGCT CTGGTTGACC TGGGCTGGTT 2340 

TTCCCTGTCT CTGAGAACTT GAGACCTAGC TCCGAGTTGA ACTGTGCGTC AGCACTCCAG 2400 

TCCCATCACC TGAACCTTCA GTCTCCCCCA TCTGTTACAC TAGAGGGCTG CAGGACTCTA 2460 

TCCACCGCCC CCGGGTTATC ATTCAGGGCC CCATCATCTT GGATGCTGCC CTGCGTATTT 2520 

GGCAGCAATG GTGGGCCACC CAGGGCCTCT GAGTAGCCAC CCAAAGCCTA GCCGCTGTTC 2580 

TAGGGAACGG AAAAGAGTTC ATGGCCAAGC GTCTAACCTA AAGTCCCAGG ATTGGCTCCA 2640

GGCAGCAATT ATATCATAAC TTATTGAACT TTTGAGCAGG ACGTGCTGGT AATTTCATGG 2700 

CTGTTACTGC CCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCGT 2760 

TCATTTGTCC ACTGTTTCTT GTCATCACGC AGCCCTGGAC CCAAAGGGTG AAC7AAAGTT 2820

TAAGGAGATG AGAGGATTCA AGGAGCCCGT TGGTGACGCC TTTCAGTAGC TGGGGAGGGC 2880

TCTTCCATCC CCAGCACCCC CTGCTACACC TCAGCAGCCT CCCCCATGCA AAAAGGAAAG 2940

AGAAAAATTA AGTTAGGGCA GTCAGTAAAG TGAGCTT7AG AAAGAAACTG GAA7TTTAAC 3000

TTCATTTTGT ATCTTGCTTA AGTAGCAGGC TCACTAAAAT TAGAGAAAGT CCAATAACTC 3060 

TCCCCCTTTC CCTTGAGAAA TCTTTAAGTT TCGATTCTGG AGCAAAAACT TTCAGCATTA 3120

AATATTTCAG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTGTTCAG ATGGACTGTT 3180 

TTAATAAAAA TCTTTGAGCA AGTGAGTTAT GGCAAGAGAA ACTCAGCCTC TTTCTGTATA 3240 

AACTTAACAC GGAAGGGCTG GGGTGTGAAA AAGAAGATTG TATGAAAACC ATTGGTAATT 3300

TTTATTTTTT ATTTTTGGGA CTGCACTATC CTGTTCACGA AGACATGTGA ACTTGGTTCA 3360 

GTCCAAATGG GGATTTGTAT AAACCAGTGC TCTCCATTAG AAATATGGTG CAAGCCACAT 3420 

ATGTAATTTT AAATATTCTA GTAGCCACAT TAATAAAGTN AAAAGAAACA AAAAAAAAAA 3480 
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FKHLTHYROI DTRANSCRIP TIONFACTOR TTFMTAESGP PPPQPEVLAT VKEERGETAA 
GAGVPGEATG RGAGGRRRKR PLQRGKPPYS YIALIAMAIA HAPERRLTLG GIYKFITERF 
PFYRDNPKKW QNSIRHNLTL NDCFLKIPRE AGRPGKGNYW ALDPNAEDMF ESGSFLRRRK 
RFKRSDLSTY PAYMHDAAAA AAAAAAAAAA AAAAAIFPGA VPAARPPYPG AVYAGYAPPS 
LAAPPPVYYP AASPGPCRVF GLVPERPLSP ELGPAPSGPG GSCAFASAGA PATTTGYQPA
GCTGARPANP SAYAAAYAGP DGAYPQGAGS AIFAAAGRLA G: 



YGRTSPGQFG ALGACYNPGG QLGGASAGAY HARHAAAYPG GIDRFVSAM 
Seq ID NO: 318 DNA sequence



CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 
AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 
AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 
GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA
GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 
TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAA7CCT GGATGAGGAG CATCCCAAGG 
GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 
ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 



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CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAGCCT 720

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACAGA GTCTAACCTG CAGTACAGCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGACGG AAATCGTGCG GTCTTGGTCG CTTGCACTGA 840 

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCAGCAGC CGTTGGCAGC CTGCTGGCTG 1020 

GAGGACCCGT TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 
TCACAGCATA TTTCAGGAGA AAATGCGTGG CCGCCACGGA T 



AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 
AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGG1 

3ATCTGA CAGCAGCACA G 

C CATGACTTTT GCTTTGAAAG TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740

AGGCGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CAGTGACGAG CGGCCCAGTC 1800 

CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GCGCTTACAG AGGACACTGC 1860 

ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACCGCCAGT TGAGATCAAT TCAAAAAAGG 2520

TCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 




GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAACGAGA 2820 
CCTCGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA G 
CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG A 
GCACCCTGCG AGCTTCC 

CTATGAAGTT TTTTGACACG ACCCCCACAG G 

TCGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC C7GGT7C 
GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC C 
TGAAGCGTCT GGACAATATC ACGCAGTCAC C 
AGGGCCTTGC CACCATC 
AGCTGCTGGA TGACAACCAA GC1 
CTGTGCGGCT GGACCTCATC AGCATCGCCC T 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 
TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CCGAGAAGCA TTTGCAGACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TCGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920



CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 



C CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460 

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 
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ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 57 60 
CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 



I I I I I I 

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKPRR TRPLECQEAL ETAARAEGLS 60 

LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSIAR 120 

VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VKIFCRTRLI 180

LSIVCLMITQ LAGPSGPAPM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW 240 

ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNEGQRMFEA AAVGSLLAGG 300 

PWAILGMIY NVIILGPTGP LGSAVPILPY PAMMFASRLT AYFRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIW VIASWTFSV 420 

HMTLGFDLTA AQAFTWTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 480 

NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 540 

VLAEQKGHLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS IDLEIQEGKL VGICGSVGSG 600 

KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYD EERYNSVLNS 660 

CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 720 

NHIFNSAIRK HLKSKTVLFV THQLQYLVDC DEVIPMKEGC ITERGTHEEL MNLNGDYATI 780 

FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 840

VPWSVYGVYI QAAGGPLAFL VIMALFMLNV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGWPVKGT LRASSRLKDE LFRRILRSPM 960 

KFFDTTPTGR ILNRFSKDMD EVDVRLPPQA EMPIQNVILV FFCVGMIAGV FPWPLVAVGP 1020 

LVILFSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEL 1080 

LDDNQAPPPL PTCAMRWLAV RLDLISIALI TTTGLHIVLM HGQIPPAYAG LAISYAVQLT 1140 

GLFQFTVRLA SETEARPTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTPENAEM 1200 

RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD 1260 

IGLADLRSKL SIIPQEPVLP SGTVRSNLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 1320 

SEVMENGDNF SVGERQLLCI ARALLRHCKI LILDEATAAM DTETDLLIQE TIREAFADCT 1380
MLTIAHRLHT VLGSDRIMVL AQGQWEFDT PSVLLSNDSS RFYAMFAAAE NKVAVKG 
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AGCAGTTGCA CAACTTCCAG CAACTTTCTC AGCCGGCTAC TAATGAGCTG AAAGCCAGGA 6 0 

ACATCCGAGG AGAAGAGAAA GCTTCCAGCC CTCCTCCCTT CACCCTGGAA A7CCAGACAC 120

CCCCACCCCC ACCCTCAGAT CACTTTAAGA TAATTTCTTT ATTCGTTTGC CCGACAGACC 180 

ATGGCTCCCT TTGGAAGAAA CTTGCTAAAG ACTCGGCATA AAAACAGATC TCCAACTAAA 240 

GACATGGATT CAGAAGAGAA GGAAATTGTG GTTTGGGTTT GCCAAGAAGA GAAGCTTGTC 300 

TGTGGGCTGA CTAAACGCAC CACCTCTGCT GATGTCATCC AGGCTTTGCT TGAGGAACAT 360

GACGCTACGT TTGGAGAGAA ACGATTTCTT CTGGGGAAGC CCAGTGATTA CTGCATCATA 420

GAGAAGTGGA GAGGCTCCGA AAGGGTTCTT CCTCCACTAA CTAGAATCCT GAAGCTTTGG 480 

AAAGCGTGGG GAGATGAGCA GCCCAATATG CAATTTGTTT TGGTTAAAGC AGATGCTTTT 540

CTTCCACTTC CTTTGTGGCG GACAGCTGAA GCCAAATTAG TGCAAAACAC AGAAAAATTG 600 

TGGGAGCTCA GCCCAGCAAA CTACATGAAG ACTTTACCAC CAGATAAACA AAAAAGAATA 660

GTCAGGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCAGG ACACAGTTTC TCATGATCGA 720

GATAATATGG AGACATTAGT TCATCTGATC ATTTCCCAGG ACCATACTAT TCATCAGCAA 780 

GTCAAGAGAA TGAAAGAGCT GGATCTGGAA ATTGAAAAGT GTGAAGCTAA GTTCCATCTT 840

GATCGAGTAG AAAATGATGG AGAAAACTAT GTTCAGGATG CATATTTAAT GCCCAGTTTC 900 

AGTGAAGTTG AGCAAAATCT AGACTTGCAG TATGAGGAAA ACCAGACTCT GGAGGACCTG 960 

AGCGAAAGTG ATGGAATTGA ACAGCTGGAA GAACGACTGA AATATTACCG AATACTCATT 1020

GATAAGCTCT CTGCTGAAAT AGAAAAAGAG GTAAAAAGTG TTTGCATTGA TATAAATGAA 1080 

GATGCGGAAG GGGAAGCTGC AAGTGAACTG GAAAGCTCTA ATTTAGAGAG TGTTAAGTGT 1140

GATTTGGAGA AAAGCATGAA AGCTGGTTTG AAAATTCACT CTCATTTGAG TGGCATCCAG 1200

AAAGAGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TGAACTCCTG 1260

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAAGATG GGTGCCAGTT AAAGGAAAAC 1320 

AGAGCGAAGG AATCTGAGGT TCCCAGTAGC AATGGGGAGA TTCCTCCCTT TACTCAAAGA 1380

GTATTTAGCA ATTACACAAA TGACACAGAC TCGGACACTG GTATCAGTTC TAACCACAGT 1440

CAGGACTCCG AAACAACAGT AGGAGATGTG GTGCTGTTGT CAACATAGTT CCAATGGCTC 1500 

CTTTCTGACC TGCTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CTCATTTTAA 1560

ATATAACACT CAAAAAAATG TAAATCATAT TGTAGTATTC AATAGTTAAT AAAAACTCGA 1620
GAAATGTGTT GTTTCTG 



MAPFGRNLLK TRHKNRSPTK DMDSEEKEIV VWVCQEEKLV CGLTKRTTSA DVIQALLEEH 
EATFGEKRFL LGKPSDYCII EKWRGSERVL PPLTRILKLW KAWGDEQPNM QFVLVKADAF 
LPVPLWRTAE AKLVQNTEKL WELSPANYMK TLPPDKQKRI VRKTFRKLAK IKQDTVSHDR 
DNMETLVHLI ISQDHTIHQQ VKRMKELDLE IEKCEAKFHL DRVSNDGENY VQDAYLMPSF
SEVEQNLDLQ YEENQTLEDL SESDGIEQLE ERLKYYRILI DKLSAEIEKE VKSVCIDINE
DAEGEAASEL ESSNLESVKC DLEKSMKAGL KIHSHLSGIQ KEIKYSDSLL QMKAKEYELL
AKEFNSLHIS NKDGCQLKEN RAKESEVPSS NGEIPPFTQR VFSMYTNDTD SDTGISSNHS 
QDSETTVGDV VLLST 



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GGCGCGTGCG TGCGTGTGTG 
AGTTAGAGTC CCAACTCTTG 
3 TAGTGGGCGT 






TGCGGGTGTG 
TGCGCGCGCT 
GACTCCATTT 



AGAGCAAGAG GAGGACATGG 
CCCGGAGGAG 
AGGCCTGAAT 
TTCGCTGGCC CGGCTTCCCA 
AATTTCTGGA 



TTGGGGAAGG 
TAGTCCTTGA 



GTGTGTGTAT GTGTGTGTGT ATGTGTGTGC 
AGTGTGTGGA CAAGGAGGTG GGGGCAGCTG 
GCTATTCTCT TCTTTCTCCC CCACACCTAT 
GTTCCTTTTC ATTCATTTCT AAATCTCTTA 



TTTTGAACTA 
GCCGGACTCT 
AAATGAAGCT 
TGAGGATGAG 
AGTGGGCCTC 
TGTTGAAGAA 
GAAACGAGAT 
TTCTCTAATG 
AGCTATCCCT 
TTGCCTTATC 
ATTTTCATTG 
TTCAAGCTTA 
GGTGTAGTAT 
AATATAAAAT 
GGAATCTTTA 
TTTGGTTGCT 
TTCAGCAACA 
TAGTGGTGAC 
CATGCAGGTG 
GTTTTTAGGT 



GACCTGTTTA 
CTGCAGCAAA 
GAAGAGGAGG 



ACTGTGAGAT CACAAACCTG 



GATGAAGATG 



AAGGATATGA 
AAGATGAAGC 
TGAAAGAAGA 
AGGAAGAAGA 
ATGGAGAGGA 



ACAGAAGATA 
ATTCCAAATA 
ACTCCCATTG 
CTGTATTAGT 
GGTGCATTTC 
GACAATAGTC 
GTTTGAAGGC 
TTTTTGTCAC 



ATGTGTAACT 
AGAACTAGTC 
TGGAATTCCC 
CATTTTTAGC 
ATTCCTTATT 
TCTTGAGTGG 
AGTTACCTTT 
AAGTAACTTG 
ATTTTGTGGA 
TTATTTTTGA 



GAAGATTAAC CTGGAGTTAA GGAACAGATC 360 
TAATTGCCTG TGTGTCAATG G 
ATTTCTGAGT ATGGCTAATG T 
ACTTCGAAAA TTGGAG( 
GAAATGTCCA AATCTTACCT A 



1260 
1380 



TGGAGATGAA 



AGGTTCAGAG 
AATTCAGGAT 
GGAAGAAGGA 
AGAAGATGAC 
TGATCACATC 
TTTTATAGGA 
TGTTAATGAT 
TAGCAATTTA 
CCATAATTAA 
TATAGATTAA 
TAAGTTGGTT 



GAAGATTATA 
GATCAGGAGG 
GATGATGAAG 
GAGGAAGAGG 



GAAGAAGATG 
GGTCTTCGAG 
TAGATCATTC 



AAAGTGTGGT 
CATATTGTAT 
TTTAGACTTA 
AACATGATCA 
CTGAAATTAC 
ATTTTTTTAG 



GAGAAGAGGA 
ATGATGACTA 
GGGAGAAGAG 
TAAGACCAGA 
CATGTACGAT 
TTTACTATTT 
G-AGAGAAAA 



CTTTAGTCTT 
AAGAAATTGA 
TCTCCTTACC 
AAAGAAAAGC 
GCCTGTTGAT 
TTGGATACCA 



ATAAATGTTT 
CTCTGCAAAG 



CTCAAAAGGA 
TACGTTTCAG 
ACATAAGAAT 

TTGGAAGCAG 



GAAAATAGAA GCAGAATAGT 
GGTTCTATTC AGTAATATGG 
AGGAAAAATT GCTTATACTA 
AATCATGACA TGCCGATGGT 
GCAATTCCTT TATGATCACC 
ACTTGGGAAA CTTGTGAAAC 
TCCTGTCTGG GTTTTAAATA 



ACTAAGAGTG 
AAAGGTTCTA 
TTCATGGATT 
AGTCCAGAGA 
TGTTTATTTT 



A GATTTGTTTT 
AGTGAGTGTG TGTTTCCTTA 
CTCCTCTTTT GCTATGGAGG 



TACTGCTTCT 
CTTGTGGAGA 
TAGTGAGGGC 
ATTCCCTGTT 
ATTTTTAAAA 
ATTAGAGTAC 
TTAGCTTTGT 



TTATGCTTCT 
ACAATGTAAA 
AACTAAAAAT 



CTAGTTCCTA 
TTTGGTTTGA 



Seq ID NO: 323 I 



MEMKKKINLE 



TTGAATATTT 
ATACTGGGAC 
ATGTATGACT 
GTTGTACAAC 
CCAAAAAGAT 
ACAATTCTCA 
ATATTGACTT 
AAACAACACT 
TTCATGCAAA 
TGCTCTTTGT 
CTTTGAAGTT 

i sequence: 



I 



CTTTAATTCC CAGTTTTAAA ACAGA7ATAA 
GCTGGAAAAG TA7TTGAAAC TAAAT7GACA 
TCCATTAAAA GTAGAAAAAT ATTTGGGATA 
AAATAAAATA TAATGAG7AT ACAAGTATAT 
AAGGCAATGG CTTTTTAAAT CTTGGCTATC 
AAGAAGTTAG T 



TTACATACTA TGCCAGATTA 
TCTTATCAAT CATCTTACTG TGCAATCAAA 
TAGAGCCTCC AGATAACTTT TAAGACTTAT 
TAAGTAAGGG TGGGTTTTAT ATTTTGTAGA 
ATGGCAGTAT GTATATATTG TGTTAAGTTC 
AATACTTTTG TGCAACTGTG TTTTGAATAA 



3120 
3180 
3240 



I 



T EIiVLDNCLCV NGEIEGLNDT FKELEFLSMA NVELSSLARL 
HISGGL EVLAEKCPNL TYLNLSGNKI KDL5TVEALQ NLKNLKSLDL 
FNCEITNLED YRESIFELLQ QITYLDGFDQ EDNEAPDSEE EDD3DGDEDD EEEEENEAGP 
PEGYEEEEEE EEEEDEDEDE DEDEAGSELG EGEEEVGLSY LMK3EI0DEE DDDDYVEEGE 
EEEEEEEGGL RGEKRKRDAE DDGEEEDD 

Seq ID NO: 324 DNA sequence 
Nucleic Acid Accession #: NM_003812
Coding sequence: 224..2722



I 



I 



GCTCTCGCCG 
ACGCGGCCCC 



GGCTGCTGCG 
GGCAGATGAA 
AATGCAGAAA 



GAGTGGCTGC GAGGCTAGGC GAGCCGGGAA A 
: GAGCCCCGCG CCCCGTGCCC CGAGCCCGGA GCCCCCTGCC O 

TGACCGGCTC CGCCCGCGGC CGCCCCGCAG CTAGCCCGGC 
GCGGCGCCCG GGAGCTATGA GCCATGAAGC CGCCCGGCAG 
TGGCGGGCTG CAGCCTTGCC GGCGCTTCCT GCGGCCCCCA 
TGCCTGCCAG CGCCCCGGCC CGCACGCCGC CCTGCCGCCT 
CTTCTCCTGC TGCCTCCGCT CGCCGCCTCG TCCCGGCCCC G 
CCCAGCGCTC CGCATTGGAA TGAAACTGCA & 

GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCAGTT ACAGCAATGC 
GAAATCACAC TGCCTTCAAG ACTCATATAT TACXTCAACC AAGACTCGGA 









WO 02/086443 

AAGCCCTTAT CACGTTCTTG 
CCATCTGGCC CAGGCAAGCT 
CATACTGAAC AATGGTTTGT 
ACCACAGTAC TCTAAGGGTG 
AGACTCCAAG GTGGCTCTGT 
CTTCGTGTAT ATGATAGAGC 
ACATATAATC CAGAAAACCT 



ACACAAAGGC 
TCCAGATTGA 
TGTCTTCTGA 
GAGAGCACTG TTACTACCAT GGAAGCATCA 
CAACCTGCAA 
CACTAGAGCT 



Z AAGACACCAG 
H AGCCTTCGGC 
A TTATGTGGAG 



CAAAAACATA 
TCCAAATTCA 
ATTCACTACG 



GGAAAGAGGT GACCAGTGGC 
AGCAGTGAAT CCATCACGTG 
TAATGATCAC AAAACGTATA 



CCTGGTGGCT 
GCAGATGCTC CATGAGTTCT 
GCACCTCATC TCGCGGGTGA 
TGTCTGTTCT CGCACAAGAG 
ACAAGTATTA TCGCAGAGCC 
AAAGCCAAAA TGTGACTGCA 
GTCCCATTCT CGAAAATTTT 
A GCCTGCCTTT 



GTATATTTGA 
AGAAGCATCG 
TGGATTCTAT 
GGACTGAGAA 
CAAAATACCG 



T GGCATGTTTG 
T GAGAAAAGCA 
G CAAATGAAGA 
G TGGTTGAAAA 



GAGGCGTCAA 
AAGATGATAC 
CAGGTCGACC 
ATCTCACTAT 
GAAGGAAGAG 



TGAATTACAG T 
AGAAATGAAA T 
CTCTTCTCAT GCACATACCA ACAACTTTGC 



TAAGAGAAGC 



GACATCACCA CCAACCCTGT 
AAGCAGCATG CTGATGCTGT 
AGTCTGAGTT ACTTTGGAGG 



ATTATGCTGT 
TAACAATACC TCATGTCTTT 
GTGTGATATT ACIGAATATT 



GGGAGGAGTG TGATT 



CGGGGCTCAC 



TTTGAGCCCA O 

AATGCTATGG 
GGCCCTGCTG 
TGCCGGGATG CTGTGAACGA 
TGCCCACCAA ATCTTCATAA 



CAGAGACAAC 
CTATGAAAAG 
GTGGATTCAG 
TCGAGCTCCA 



CAGTGTCAGT 
CTGAATACAG 
TGCAGCAAAC 



CTATGTAGAA 



ACATCTGGGG 
AAGGCACTGA 
ATGATGTGTT 
AACTTCAGGG 
GTGGTGCCCA 



GAAGGGAAAC 



GGGCCATGGG 
AGATTGCAGT 
GGGTCCTAGT 
TATTGTCCTT 
TACTCAGCAA 
ATTCTGGGTA 



GCCCTAAATA 
GTGTGTAGTA 
ATCCGGGATC 
GCCACCAATC 
GGGGGCACAG 



ATGAAGCCAC 
CAGTTAGGAA 
TCATAATAGG 



TCCACTCGAT 
CTGCATTTGT 
CCTTCACCCC 



GCAGGGTCTG A 
TGCGGGAAGG ATGGAGACCG 
TTACTCTGTA CCAATCTTAC 
CCAACTTCCT TCTACCATCA 
GATGATGATA CGGATGTGGG 
TGTTTAGATC GGAAGTGCCT 



GAATCAGCTG 
GCAGCAGTGT 
GGAGCTAAAG 
GTCAAAGAAC 
TAAAAAGAAC 
CAAAAATTAA 



GATTTCACCT GGGCAGGGAC 
CCCAAGGATG AAGGACCCAA 
GGTGCCATCC TGGTAGCAGC 
TCGATCC 



TACTGGAACT 
TTGGGGTGAC 
ACCTTTCACC 
TGTTCCAGAA 
ATGCAATAAA 



ACACCGCCTT GCACTGTTGG 
ATTAAGTTTG TAAACAAAAC 
AAGGATGGGG TAAAAGAAAA 
ACCTGTCAGT AAACGGGGGA 
TCTTTTTTTT TCCCTAATGG 
GGAATCATTA AAAA 



2580 
2760 
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VLDTKARHQQ KHNKAVHLAQ 
KGGEHCYYHG SIRGVKDSKV 
KTLAGQYSKQ MKNLTMERGD 
TYKKHRSSHA HTNNFAKSW 
EFSKYRQRIK QHADAVHLIS 
QSLAQNLGIQ WEPSSRKPKC 



I 

MKPPGSSSRQ PPLAGCSLAG ASCGPQRGPA 
RPRAWGAAAP 
INQDSESPYH 
HYENGKPQYS 
KSTGRPHIIQ 
LELMIVNDHK 
ITTNPVQMLH 
LPMAVAQVLS 
RDFLQRGGGA 

CLFQPRGYEC RDAVNECDIT 
CQYIWGTKAA GSDKFCYEKIi 
IGQLQGEIIP TSFYHQGRVI 
LNMSSCPLDS KGKVCSGHGV 
TNLIIGSIAG AILVAAIVLG 



Seq ID NO: 326
Nucleic Acid Accession #: AK074418.1
Coding sequence: 244-1515



I I 

'PPCRLLLVL 

: SYSNAM0KE 
ASFQIEAFGS KFILDLILK1I 
MFBDDTFVYM 
LKRRKRAVNP 
IYKEQ LNTRWLVAV 
YKRSS LSYFGGVCSR 
WGGCI MEETGVSHSR 



NGECKTRDNQ 



SRGIFEEMXY 
ETWTEKDQID 
TRGVGVNEYG 
KFSKCSILEY 
KCSLSNGAHC 
ACNQNQGRCY 



CATGCTCTCC 
CTGTGGAACC TGCCCGTCTC 
AAAAGTGAGG GACCGGTGAG 
AAATGACAGA GCTGGAGTTC TGCTGTGCCT 
GTCATGGCGT ATTACCAGGA GCCTTCAGTG 
GACTTTACCA CCTTGCGGGA TCACTGCCTG 
TTCCCCGCAG CAGATTCTTC CATAGGCCAG 



CCCTCCTCCA 
TGACTTGCTG 
GGAAAGGACC 
GAGACCTCCA 
AGCATGGGCC 
AAGCTGCTCC 



TCAGACACAC 



TCATCAAGTT 
GGACGTTTAA 
AGGAAAAACG 
CTCACTTCAT 



CTGCCTAGGA 12 0 

TACCAGATGC 180 

TCTAAGGAGA 24 0 

CAAAGACCAG 300 

GGATGAGACA 360 

CCTCTCCAAT 420 



GTGATTGATG 
CAAAACCAAG 
TATTCCGATC 
ATCACCAACA 
ACCAAGGCAG 
GCGATGGAGA 
CAATACCGAA 
ACCGAATGGA 



ACAGTACAGG 
CCGTTTCCGG 
TGTCCAGGGA 



C GCAGCTGACT GCTGGTTCCT GGCAGCACTG 540 



CAGAAGATCC 
TTCTGGCAAT 
GATAAATGCC 



TCCATCTGCA 



ATGGGCTGGT GAGTCTCCAT 
GGGGCTGGGA AGAAATTATC 
GAGGGCGCTG 



GATGCCCTGG 
GTGGACCTGG 
ACTCCAAGTG 
GCCTACACTG 
TCCCIGTGGA 



TGATGGTCCA 
GTGGCCAGTG 
TCTTTGTGCG 
ATGCCAAGCT 
TGGACCTCAC 
TGAAGGCAGT 



GGTGGAAGTG 
TCCTCGCCAC 
GCTCGGATCC 



GAAGACAGCG 900 







CGGAAAAGCC AGCTACATAA 
TTCCAACAGA AATTCATCGC 
GGAAACACAC TCCACGAAGG 
AACACTGCAG GAGGACCTCG 
GAAGGCACCA ATGTTGTCGT 
GAAGATGCAA AATTTCCACT 
CCAAAGCTCA AATAATAAAT 
GAACTATGTT GTGGTTGCAC 
CCTGAAAATG CCAGACAGTG 
AAGCCCTTCA GAACATGGCT 



GAAACGGGAA 
CATGTTTATA 
ATGGTCCCAA 
GAATGATGCT CAATTCAACT 



TTTGGATGTC 
TTCCAATTAC 
GGAAGCAAGT 



CCCTCACAGG CCCTTACTGG 
GCCTCTCTTC CTGGATCGTC 
CGCCCCACCC AGTCTCATCC 
GGATAATTAT GGGGTGTGAG 
ACCCCGTGAA ACCTTTCCTT 
CCCGGGAGCT AGCCAGCTTC 
TGCACACAGG ATTTCCTTAA 
GAATAAAATA GCTGCCAGGG 
1 AAAAAAAAAA 



ACAGGCACCT 
CCCAACAAAG 
TTACGTGGGA 
GATGCAGAGA 
TCCAGAACTG 
GGGGGACTTC 
GTGCATTGCC 



GTTGCTGTCA 
GTGATTCTGG 
CTTCACCATG 
AAAATCAGCG 
GAGCAGCCAT 
CATTTTCAAC 
TTGGAGAAAG 



CACCATCAAA 
CTGGCTCACA 
ACTTACCATC 
GAGTTCTTGC 



AGATATGCTC 
GGGACCTGAG 
( i rGATGGAc 
CAAGCTCGGT 
GCAGAGCTTA 



GTGTCAAGAT 
CCTGGACCAT 
GATTCTAGGA 
AGAGCCAATG 
TTTGAAAGCA 
GAAACACTGT 
TGAGCCCTGG 
TCCGAATCTT 
GAATGAAGGG 



PCT/US02/12476 



AACTGTTATA 
GCTCTGCACA ATGAGCCTCT 
AAAAAAAAAA 



ACCAACCTGG 
TTGCTGCCAA 
AAGAACTCCT 



AATTGGCAGT 
CATCGTTCCT 
ACCACCTATG 



1 it: BAB85075.1 



I 



I 



I 



I 



I 






MAYYQEPSVE TSI IKFKDQD FTTLRDHCLS MGRTFKDETF PAADSSIGQK LLQEKRLSNV 
IWKRPQDLPG GPPHFILDDI SRFDIQQGGA ADCWFLAALG SLTQNPQYRQ KILMVQSFSH 

QYAGIFRFRF WQCGQWVEW IDDRLPVQGD KCLFVRPRHQ NQSFWPCLLE KAYAKLLGSY 

SDLHYGFLED ALVDLTGGVI TNIHLHSSPV DLVKAVKTAT KAGS&ITCAT PSGPTBTAQA 

MENGLVSLHA YTVTGAEQIQ YRRGWEEIIS LWNPWGWGET EWRGRWSDGS QEWEETCDPR 

KSQLHKKRED GEFWMSCQDF QQKFIAMFIC SEIPITLDHG NTLHEGWSQI MFRKQVILGN 

TAGGPRNDAQ FNFSVQEPME GTNWVCVTV AVTPSNLKAE DAKFPLDFQV ILAGSQKHCP 



Seq ID NO: 328 DNA sequence
Nucleic Acid Accession #: EC017490.1
Coding sequence: 74-2788



I [■ -i CGCGAA 
AATCATCGGA 
CTCTCACCTC 



CATGCGGCAG 
GTATGACAGC 
GGCCACGGAG 
TCTCAAAGGC 
CCACCGCTTC 
GGAGCGCATC 



TGAACCACTT 
GCTATGGCGG 
A GGCAATGATC 
C TCCAGCCCTG 
A GAGGGGCCCC TGGAGGAAGA A 
G GACTACCGCG CCATCCCAGA G 
3 AGGAGCTGAC 
3 AGGCTGGCCG 



ATCCTTCACC 



TGCTGTAGTG 
ATGGCATCCA 
CGAAGCTCCC 
GAGGATGAGT 
GAGGAGCTCA 
TATGAGGCCG 
AGGGAGGCAG 



GGCGTACTGA 
CCGAGGGGCT 

TTGGAGATGG 




GCAGATCTTT 
CATCACCAAC 
GCTGAGGCAG 
TGGCGTCCTG 
GGGTCCTTTC 



CACTCTGTGC 
AAGAACTTCC 
AGCGACATGT 
AGGGAGCACG 
GATGAGGCTG 



CTGCATCTGA 
CCCCAGCTCA 
TGCCAGTCCC 
CCCTTTGAGG 



TCCGCATCTC 



GGTACTGGCC 



C GCAGATCTGG T 



CACTGTCATC 
GACCGATGAA 
GATCTTTGCC 
TCTGGCCCTG 
TATCAACGTG 
TGAGAAAGTG 



Z ACGTGGCCAA 



TTCGGAGGGG 



CTTCCATCTA 
AGCCCAAAAA 
GAGACCCTGG 



GGAGACCATC 
GGCTGGCCGG 
CAAGCCAGGA 
CAACACTGCC 
GAAGGACAAC 
CCTCTCCAAG 



ATGTACCCCA AGTACGACCG 
CTGGTGGAGG AGCTGCGCTC 
GGGGTGGTGA CCAGCTGCAC 
AACAAGTGCA ATTTCGTCCT 
CTGAGTGCCA 
ACCAGCGTAI 
CTGCCCCGCT CCAAGGACGC 
AGCTGACTGG 



TAGGGGAACT 



CCCAGGTGGC 
CACAGCGAAG 
CACTGGCCAG 



GACATCAAGA GAGGCCTGGC 
AAGCACAAGG TACGTGGTGA 
TCGCAGTTTC TCAAGTATAT 



GGTTCTGGCT 
CAGAACCAGC 
CGTCACCTCC 
CTACGACCCC 
CTTTGACATC 



CCAGGAGGTC 
CCAGATGGAC 
GACAGGCAGC 



GACTTTTGCC 



GACCGAGGAG TGTGTCTCAT TGATGAATTT 
ACAGAGCATC 
CATTGCTGCC 
CGTGGACCTC 
CIGTGTGTGG TGAGGGACAC CGTGGACCCA 
GTGGGCAGCC ACGT CAGACA CCACCCCAGC 
AGCGCTGCTG AGCCCGCCAT GCCCAACACG 
CTCAAGAAGT 
CAGGACAAGG 
ATCCCCATTA 
ATCCATCTGC 

CGCTACCTTT 
GTGGCAGAGC 
GTCCCTGAGA 



ACCTTGGAGG 
GACAAGATGA 
TCCATCTCGA 
GCCAACCCCA 
ACAGAGCCCA 



TAGACACACA 
CATTCCGGCG 
AGGTGACATA 
AGGACTTGGT 



GTACAGTGAC 
CATCGAGTCC 
GATCGAAGAC 



TATGGCGTGG 
AGGGTCCACC 
CTGAGGAAAG 
ATGATCCGCA 



TCATCTCACG 
AGATGCTGGC 
AGGAGGGGCT 
AGCCCCTGCC 
CGAAGCTCAA 
AATCTATGGC 
TGGCGGAGGC 



TGACAACAAT 
TCAGCGCAAC 
GGATAAGGCT 



A GCATGCGCAA 



GAGCTGITGC TCTTCATACT 
CGCTTTGGGG CCCAGCAGGA 
CGTCAGATCA , 



2580 
2640 
2700 







CCTCTCTGCA TTTTATGACA GTGAGCTCTT CAGGATGAAC AAGTTCAGCC ACGACCTGAA 2760

AAGGAAAATG ATCCTGCAGC AGTTCTGAGG CCCTATGCCA TCCATAAGGA TTCCTTGGGA 2820

TTCTGGTTTG GGGTGGTCAG TGCCCTCTGT GCTTTATGGA CACAAAACCA GAGCACTTGA 2880

TGAACTCGGG GTACTAGGGT CAGGGCTTAT AGCAGGATGT CTGGCTGCAC CTGGCATGAC 2940 

TGTTTGTTTC TCCAAGCCTG CTTTGTGCTT CTCACCTTTG GGTGGGATGC CTTGCCAGTG 3000 

TGTCTTACTT GGTTGCTGAA CATCTTGCCA CCTCCGAGTG CTTTGTCTCC ACTCAGTACC 3060

TTGGATCAGA GCTGCTGAGT TCAGGATGCC TGCGTGTGGT TTAGGTGTTA GCCTTCTTAC 3120 

ATGGATGTCA GGAGAGCTGC TGCCCTCTTG GCGTGAG7TG CGTATTCAGG CTGCTTTTGC 3180 

TGCCTTTGGC CAGAGAGCTG GTTGAAGATG TTTGTAATCG TTTTCAGTCT CCTGCAGGTT 3240

TCTGTGCCCC TGTGGTGGAA GAGGGCACGA CAGTGCCAGC GCAGCGTTCT GGGCTCCTCA 3300 

GTCGCAGGGG TGGGATGTGA GTCATGCGGA TTATCCACTC GCCACAGTTA TCAGCTGCCA 3360 

TTGCTCCCTG TCTGTTTCCC CACTCTCTTA TTTGTGCATT CGGTTTGGTT TCTGTAGTTT 3420 
TAATTTTTAA TAAAGTTGAA TAAAATATAA AAAAAAAAAA AAAAAA 



I I I I I I 

MAESSESFTM ASSPAQRRRG NDPLTSSPGR SSRRTDALTS SPGRDLPPFE D - " 
GPLEEEEDGE ELIGDGMERD YRAIPECjDAY EAEGLALDDE DVEELTASQR EAAERAMRQR 
DREAGRGLGR MRRGLLYDSD EEDEERPARK RRQVERATED GEEDEEMIES I 
SVREWVSMAG PRLEIHHRPK NFLRTHVDSH GHNVPKERIS DMCKENRESL V 
EHVLAYFLPE APAELLQIFD EAALEWLAM YPKYDRITNH IHVRISHLPL VEELRSLRQL 
HLNQLIRTSG WTSCTGVLP QLSMVKYNCN KCNFVLGPFC QSQNQEVKPG SCPECQSAGP 
FEVNMEETIY QNYQRIRIQE SPGKVAAGRL PRSKDAILLA DLVDSCXPGD EIELTGIYHN 
NYDGSLNTAN GFPVFATVIL ANHVAKKDNK VAVGELTDED VKMITSLSKD QQIGEKIFAS 
IAPS1YGHED IKRGLALALF GGEPKNPGGK HKVRGDIHVL LCGDPGTAKS QFLKYIEKVS 
SRAIFTTGQG ASAVGLTAYV QRHPVSREWT LEAGALVLAD RGVCLIDEFD KMNDQBRTSI 
HEAMEQQSIS ISKAGIVTSL QARCTVIAAA NPIGGRYDPS LTFSENVDLT EPIISRFDIL 
CWRDTVDPV QDEMLARFW GSHVRHHPSN KEEEGLANGS AAEPAMPNTY GVEPLPQEVL 
KKYI1YAKER VHPKLNQMDQ DKVAKMYSDL RKESMATGSI PITVRHIESM IRMAEAHARI 
HLRDYVIEDD VNMAIRVMLE SFIDTQKFSV MRSMRKTFAR YLSFRRDNNE LLLFILKQLV 
AEQVTYQRNR FGAQQDTIEV PEKDLVDKAR QINIHNLSAP YDSELFRMNK FSHDLKRKMI 
LQQF 

Seq ID NO: 330 DNA sequence
Nucleic Acid Accession #: M17254 
Coding sequence: 257-1645 

1 11 21 31 41 51
| | | | | |



CGGGTCGCAC TAACTCCCTC GGCGCCGACG GCGGCGCTAA CCTCTCGGTT ATTCCAGGAT 120

CTTTGGAGAC CCGAGGAAAG CCGTGTTGAC CAAAAGCAAG ACAAATGAC7 CACAGAGAAA 180

AAAGATGGCA GAACCAAGGG CAACTAAAGC CGTCAGGTTC TGAACAGCTG GTAGATGGGC 240

TGGCTTACTG AAGGACATGA TTCAGACTGT CCCGGACCCA GCAGCTCATA TCAAGGAAGC 300

CTTATCAGTT GTGAGTGAGG ACCAGTCGTT GTTTGAGTGT GCCTACGGAA CGCCACACCT 360

GGCTAAGACA GAGATGACCG CGTCCTCCTC CAGCGACTAT GGACAGACTT CCAAGATGAG 420 

CCCACGCGTC CCTCAGCAGG ATTGGCTGTC TCAACCCCCA GCCAGGGTCA CCATCAAAAT 480 

GGAATGTAAC CCTAGCCAGG TGAATGGCTC AAGGAACTCT CCTGATGAAT GCAGTGTGGC 540

CAAAGGCGGG AAGATGGTGG GCAGCCCAGA CACCGTTGGG ATGAACTACG GCAGCTACAT 600 

GGAGGAGAAG CACATGCCAC CCCCAAACAT GACCACGAAC GAGCGCAGAG TTATCGTGCC 660 

AGCAGATCCT ACGCTATGGA GTACAGACCA TGTGCGGCAG TGGCTGGAGT GCGCGGTGAA 720

AGAATATGGC CTTCCAGACG TCAACATCTT GTTATTCCAG AACATCGATG GGAAGGAACT 780

GTGCAAGATG ACCAAGGACG ACTTCCAGAG GCTCACCCCC AGCTACAACG CCGACATCCT 840

TCTCTCACAT CTCCACTACC TCAGAGAGAC TCCTCTTCCA CATTTGACTT CAGATGATGT 900

TGATAAAGCC TTACAAAACT CTCCACGGTT AATGCATGCT AGAAACACAG ATTTACCATA 960 

TGAGCCCCCC AGGAGATCAG CCTGGACCGG TCACGGCCAC CCCACGCCCC AGTCGAAAGC 1020

TGCTCAACCA TCTCCTTCCA CAGTGCCCAA AACTGAAGAC CAGCGTCCTC AGTTAGATCC 1080

TTATCAGATT CTTGGACCAA CAAGTAGCCG CCTTGCAAAT CCAGGCAGTG GCCAGATCCA 1140

GCTTTGGCAG TTCCTCCTGG AGCTCCTGTC GGACAGCTCC AACTCCAGCT GCATCACCTG 1200

GGAAGGCACC AACGGGGAGT TCAAGATGAC GGATCCCGAC GAGGTGGCCC GGCGCTGGGG 1260

AGAGCGGAAG AGCAAACCCA ACATGAACTA CGATAAGCTC AGCCGCGCCC TCCGTTACTA 1320

CTATGACAAG AACATCATGA CCAAGGTCCA TGGGAAGCGC TACGCCTACA AGTTCGACTT 1380

CCACGGGATC GCCCAGGCCC TCCAGCCCCA CCCCCCGGAG TCATCTCTGT ACAAGTACCC 1440 

CTCAGACCTC CCGTACATGG GCTCCTATCA CGCCCACCCA CAGAAGATGA ACTTTGTGGC 1500 

GCCCCACCCT CCAGCCCTCC CCGTGACATC TTCCAGTTTT TTTGCTGCCC CAAACCCATA 1560 

CTGGAATTCA CCAACTGGGG GTATATACCC CAACACTAGG CTCCCCACCA GCCATATGCC 1620

TTCTCATCTG GGCACTTACT ACTAAAGACC TGGCGGAGGC TTTTCCCATC AGCGTGCATT 1680 

CACCAGCCCA TCGCCAGAAA CTCTATCGGA GAACATGAAT CAAAAGTGCC ICAAGAGGAA 1740

TGAAAAAAGC TTTACTGGGG CTGGGGAAGG AAGCCGGGGA AGAGATCCAA AGACTCTTGG 1800

GAGGGAGTTA CTGAAGTCTT ACTACAGAAA TGAGGAGGAT GCTAAAAATG TCACGAATAT 1860 

GGACATATCA TCTGTGGACT GACCTTGTAA AAGACAGTGT ATGTAGAAGC ATGAAGTCTT 1920

AAGGACAAAG TGCCAAAGAA AGTGGTCTTA AGAAATGTAT AAACTTXAGA GTAGAGTTTG 1980 

AATCCCACTA ATGCAAACTG GGATGAAACT AAAGCAATAG AAACAACACA GTTTTGACCT 2040 

AACATACCGT TTATAATGCC ATTTTAAGGA AAACTACCTG TATTTAAAAA TAGTTTCATA 2100 

TCAAAAACAA GAGAAAAGAC ACGAGAGAGA CTGTGGCCCA TCAACAGACG ITGATATGCA 2160 

ACTGCATGGC ATGTGCTGTT TTGGTTGAAA TCAAATACAT TCCGTTTGAT GGACAGCTGT 2220 

CAGCTTTCTC AAACTGTGAA GATGACCCAA AGTTTCCAAC TCCTTTACAG IATTACCGGG 2280 

ACTATGAACT AAAAGGTGGG ACTGAGGATG TGTATAGAGT GAGCGTGTGA TTGTAGACAG 2340 

AGGGGTGAAG AAGGAGGAGG AAGAGGCAGA GAAGGAGGAG ACCAGGCTGG GAAAGAAACT 2400 

TCTCAAGCAA TGAAGACTGG ACTCAGGACA TTTGGGGACT GTGTACAATG AGTTATGGAG 2460 

ACTCGAGGGT TCATGCAGTC AGTGTTATAC CAAACCCAGT GTTAGGAGAA AGGACACAGC 2520

GTAATGGAGA AAGGGAAGTA GTAGAATTCA GAAACAAAAA TGCGCATCTC TTTCTTTGTT 2580 

TGTCAAATGA AAATTTTAAC TGGAATTGTC TGATATTTAA GAGAAACATT CAGGACCTCA 2640

G GGGGCTTTGT TCTCCACAGG GTCAGGTAAG AGATGGCCTT CTTGGCTGCC 2700 
ACAATCAGAA ATCACGCAGG CATTTTGGGT AGGCGGCCTC CAGTTTTCCT TTGAGTCGCG 2760

AACGCTGTGC GTTTGTCAGA ATGAAGTATA CAAGTCAATG TTTTTCCCCC TTTTTATATA 2820

ATAATTATAT AACTTATGCA TTTATACACT ACGAGTTGAT CTCGGCCAGC CAAAGACACA 2880 

CGACAAAAGA GACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940

TACAATATGA AGTTATTAGT TCTTAGAATG CAGAATGTAT GTAATAAAAT AAGCTTGGCC 3000 

TAGCATGGCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA GTGACTAAAG 3060 

TTGCTTAATG AAAACATGTG CTGAATGTTG TGGATTTTGT GTTATAATTT ACTTTGTCCA 3120
GGAACTTGTG CAAGGGAGAG CCAAGGAAAT AGGATGTTTG GCACCC 



I 



I 



I 

TASSSSDYGQ 



MIQTVPDPAA HIKEALSWS EDQSLFECAY 
QDWLSQPPAR VTIKMECNPS QVNGSRNSPD 
PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQNI DGKELCKMTK 
DDFQRLTPSY NADILLSHLH YLRETPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR 
SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQILG PTSSRLANPG SGQIQLWQFL 
LELLSDSSNS SCITWEGTNG EPKMTDPDEV ARRWGERKSK PNMNYDKLSR ALRYYYDKNI 
MTKVHGKRYA YKFDFHGIAQ ALQPHPPESS LYKYPSDLPY MGSYHAHPQK MNFVAPHPPA 
GGIYPNTRLP 



AGGAAACGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCA 60
AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG GCTGCCGCGC CAGCTGCGCC 120



A CGCTGGAATA 




GATGTGGTGC 
CAGACCCCCA 
ATGATGCGGG 
AAGACACTAC 
AGCACCTGAT 
CTATCTGGGT 
TGCTCGGCCC 
GTCTGGCCTG 



CCCTGGCACA 



TCCTCAACAA 
ACTAGGGCAT 
AAAAGGGCAG 
GCCAAGCATG 
TTTGCTCCAT 



CCAATGACCC 
CCATCCCTAA 
AGTGCTGGTA 
AAAAAATTAG 
TCCTTTCTGC 
AGAGGTAGTG 
CCAGCCCACC 
CTCAAAGCGG 
CACCCCCTAC 
AGGGAATCCC 
ACCCCACTGC 
CACTTCCCTG 
TCTCTCTGTG 



CTGCAGGGGG CTGGGGGGGT 
TGAGTGTGGT GTGTGCTGGG 
CAGCCAAAAA TACAGCTGGG 



CAGGCTCCCT G 



GGTGCTGTGG 
ACTATAGACC ACCCTTCTAT 
AGGTGGTGTG T 
TCCTCTCAGG C 
TCACCGCGCT GCGGATCAAG 
AAGTGATTCA ATAGCCCAGG 
GGGGGGCAGT GGATGGTGCC 
GATGGGCAGC TGCGCCTGCC 
CTGAAACCTG ATCCCCTGCT 



CACTCCCGGG A 



AGTCCCAGAC TCAGAGCCCG G 



CCAGGCCTCA 



TAAATCCTAA 



GCAGGGGGAA 
GTGACAAAAG 
GACACGGAGT 
GCAACGTCTA 
TTACAGGCAC 
CCATGCTGGC 
CCCAAAGTGC 
TCTACATATT 
GACACTTCAG 
CCCTGGCAAT 
CTGGAGCACC 
GCCACCCTTG 



TGCTGAATGT 
GAGGTCCTAC 
CAAGGCCCAG 
GGTCAGTGGG 
CAGGCCTGTC 



GCCTCTAGCA 
TCAGCTCCAT 
CAGCTGCCTG 
TGAGGTGTGG 
GACTTTCAGA 



TAAGCTCCAG 
GATGCCTTGG 
AGAGAGCTGG 



C CAGAGTCAGA 
T TTGCCCCCTG 
C CCTGTCCAGC 



TTAACTGAGA 



GGATATCGAG 
ACCCCGGATG 
'TTTCCTTCT 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 




GGAAGATTTG G 

CCTATATCAC AGCTAACTTC YTCAG7CTCA T 
TTGCCTCAAG ATGGGGGTTT GAAAATAACT TTACCTGACT 



GCTTCCAAGG 
CCCTGGCTTC 
ATGGGCTCTA 



GGCTCAGACA GCTCTGGGCC TTTTGACCAC AAGCCAGCCC 
GTCTTCTCTG CCCCAGGACT GCAGGGCGGC TTCCTCCAAG 
ATTTGGCTCC ATCCAAGAAG GCTCCAGCTC CCCTACTGGC 
CCCTGGGCCA GGSCCAGAGA GTGTGTCTCA GGAGAATTCA 
CAGAAAGTTT GGGCATTTGG GAAATTTTCA AGGRTGTATG 
7ATGGYTCAC GTATGGWGCA GGTTGTCCTG GTCCYKGGGT GC\CC r 71 G \G3 3540

GAAGTGGATT GGAGGGGAGC TTGAGGAATA TAAGGAGCG: GGGTGGA3AC 7CAGGCTATG 3600 

GACAAGGACA GCCCCAAGGT TGGGAAGACC TGGCCTTAGT CGTCCTCAGC CTAGGGCAGG 3660 

GCAGTGAAGA AAGCTCTCCC CGCTCCTGCT GTAATGACCC AGAGTAGCCT CCCCAGGCCG 3720 

GCATCTTATG TGTGTCTTCC ACCATCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780 

CATTGTGCAA GGCTCGGAAG AGAACCAGGA AGTGAAACTG GGTGAAAACA GAAAGCTCAA 3840 

TGGATGGGCT AGGTTCCCAG ATCATTAGGG CAGAGTTTGC ACGTCCTCTG GTTCACTGGG 3900

AATCCACCCA GCCCACGAAT CATCTCCCTC TTTGAAGGAT TTTWATTTCT ACTGGGTTTT 3960 

GGAACAAACT CCTGCTGAGA CCCCACAGCC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC 4020 

TGGAAAATCC CTAAGAGAAG GCCTGGGGGA MAGGAAKTGG AGTGACAGGG GACAGGTAGA 4080

GAGAAGGGGG CCCAATGGCC AGGGAGTGAA GGAGGTGGCG TTGCTGAGAG CAGTCTGCAC 4140

ATGCTTCTGT CTGAGTGCAG GAAGGTGTTC CAGGGTCGAA ATTACACTTC TCGTACCTGG 4200 

AGACGCTGTT TGTGGGAGCA CTGGGCTCAT GCCTGGCACA CAATAGGTCT GCAATAAACC 4260 
ATGGTTAAAT CCTGAAAAAA AAAAAAAAA 



] Y f i 1 I 1 i 1 

MTLGSPRKGL LMLLMALVTQ GDPVKPSRGP LVTCTCESPH CKGP7CRGAW CTWLVREEG 
RHPQEHRGCG NLHRELCRGR PTEFVNHYCC DSHLCNHNVS LVLEATQPPS EQPGTDGQLA 
LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES 3LILKASEQG DTMLGDLLDS 
DCTTGSGSGL PFLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSVJF 
RETEIYNTVL LRHDNILGFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL 
RLAVSAACGL AHLHVEIFGT QGKPAIAHRD FKSRNVLVKS NLOCCIADLG LAVMHSQGSD 
YLDIGNNPRV GTKRYMAPEV LDEQIRTDCF ESYKWTDIWA FGLVLWEIAR RTIVNGIVED 
YRPPFYDWP NDPSFEDHKK WCVDQQTPT IPNRLAADPV LSGLAQMMRE CWYPNPSARL 
TALRIKKTLQ KISNSPEKPK VIQ 

Seq ID NO: 334 DNA sequence

Nucleic Acid Accession #: NM_004126.1

Coding sequence: 108-329 

1 11 21 31 41 51 

I I I I I 

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 
AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 
ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 
AAl VAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 
AACAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 
AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 
AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 
TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 
GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 
ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 
GCTTCAAATA AAGTTTTGTC TT 



Seq ID NO: 335 Protein sequence 



Protein Accession #: NP_004117.1



Seq ID NO: 336 DNA sequence 
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940



1 11 21 31 41 51 

GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT 60

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CAGAAAGTAA AGTTCCATCC 180 

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGGAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAG7CTGGA 300

GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360 

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA CCACAACTTG 480 

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540 

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT 600 

TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT C^GTTGGGAG 660

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCAACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780 

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTACTTT CAGGACTTTG 840 

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG 900

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960 

AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC 1020 
TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA a 
TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACS " - 
CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC A 

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCATTT 1260 

ACCTACACAC ACTCATTGTG GTGGCCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT 1320 

ATTTTCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT 1380 



1200 
TATATTACAA TGACAATTGC TGGATCAGTT CTGATACCOA TCTCCTCTAC ATTATCCATG 1440

GCCCAATTTG TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGCGTTC 1500 

TCATCACCAA GTTAAAAGTT ACACACCAAG CGGAATCCAA TCTGTACATG AAAGCTGTGA 1560 

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGCGAC 1620 

^ GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC 1680 

:TCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA 1740 

3 GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800 

; GTCTTACACA GTGTCAACAA TCAGTGATGG TCCAGGTTAT AGTCATGACT 1860 

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920 

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT 1980 

AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040 

GGGAATGTCA TAAAGAAGAG CCTTCACATG AAATTAGTAG TGTGTTGATA AGAGTGTAAC 2100 

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160 

CACTATGCCT GATGTGACGC TACTAACCTG ACATCACCAA GTGTGGAA7T GGAGAAAAGC 2220 

ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC 2280 

AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TACCCTTATT CSCCCCAAGA 2340 

GACCTAGCTA AGGTCTATAA ACATGAAGGG AAAATTAGCT TTTAGTTTTA AAACTCTTTA 2400 

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA GAGTGCCGTA GTCCTTTTTG 2460 

TAACTACCCT CTCAAATGGA CAATACCAGA AGTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520 

CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580 

ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTA7AATAT GCAATCTTAC 2640 

TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCC 2700 

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CCCTTCCATT 2760

TCTACTGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATTTCTT 2820 

ATTTTCTTGG AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTTGTAAAT ACTCCATTAT 2880 

TTTATTTTAT AGTCTCAAAT CAAATACATA CAACCTATGT AATTTTTAAA GCAAATATAT 2940 

AATGCAACAA TGTGTGTATG TTAATATCTG ATACTGTATC TGGGCTGATT TTTTAAATAA 3000 
AATAGAGTCT GGAATGCT 



1 11 21 31 41 51

I I I I I I 

MEKKCTLYFL VLLPFPMILV TAELEESPED SIQLGVTRNK IMTAQYECYQ KIMQDPIQQA 

EGVYCNRTWD GWLCWNDVAA GTESMQLCPD YFQDFDPSEK VTKICDQDGM WFRHPASNRT 

WTNYTQCNVN THEKVKTALN LFYLTIIGHG LSIASLLISL GIFPYFKSLS CQRITLHKNL 

FPSFVCNSW TIIHLTAVAN NQALVATNPV SCKVSQFIHL YLMGCNYFWM LCEGIYLHTL 

IWAVPAEKQ HLMWYYPLGW GFPLIPACIH AIARSLYYND NCWISSDTHL LYIIHGPICA 

ALLVNLFFLL NIVRVLITKL KVTHQAESNL YMKAVRATLI LVPLLGIEFV LIPWRPSGKI 

AEEVYDYIMH ILMHFQGLLV STIFCFFNGE VQAILRRNWN QYKIQFGNSF SNSEALRSAS 

YTVSTISDGP GYSIIDCPSEII LNGKSIHDIE NVLLKPENLY N 



Seq ID NO: 338 DNA sequence 
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379



GCACGATCTG TTCCTCCTGG GAAGATGCAG AGGCTCATGA TGCTCCTCGC CACATCGGGC 60 

GCCTGCCTGG GCCTGCTGGC AGTGGCAGCA GTGGCAGCAG CAGGTGCTAA CCCTGCCCAA 120 

CGGGACACCC ACAGCCTGCT GCCCACCCAC CGGCGCCAAA AGAGAGATTG GATTTGGAAC 180 

CAGATGCACA TTGATGAAGA GAAAAACACC TCACTTCCCC ATCATGTAGG CAAGATCAAG 240 

TCAAGCGTGA GTCGCAAGAA TGCCAAGTAC CTGCTCAAAG GAGAATATGT GGGCAAGGTC 300 

TTCCGGGTCG ATGCAGAGAC AGGAGACGTG TTCGCCATTG AGAGGCTGGA CCGGGAGAAT 360 

ATCTCAGAGT ACCACCTCAC TGCTGTCATT GTGGACAAGG ACACTGGTGA AAACCTGGAG 420 

ACTCCTTCCA GCTTCACCAT CAAAGTTCAT GACGTGAACG ACAACTGGCC TGTGTTCACG 480 

CATCGGTTGT TCAATGCGTC CGTGCCTGAG TCGTCGGCTG TGGGGACCTC AGTCATCTCT 540 

GTGACAGCAG TGGATGCAGA CGACCCCACT GTGGGAGACC ACGCCTCTGT CATGTACCAA 600 

ATCCTGAAGG GGAAAGAGTA TTTTGCCATC GATAATTCTG GACGTATTAT CACAATAACG 660 

AAAAGCTTGG ACCGAGAGAA GCAGGCCAGG TATGAGATCG TGGTGGAAGC GCGAGATGCC 720 

CAGGGCCTCC GGGGGGACTC GGGCACGGCC ACCGTGCTGG TCACTCTGCA AGACATCAAT 780

GACAACTTCC CCTTCTTCAC CCAGACCAAG TACACATTTG TCGTGCCTGA AGACACCCGT 840 

GTGGGCACCT CTGTGGGCTC TCTGTTTGTT GAGGACCCAG ATGAGCCCCA GAACCGGATG 900 

ACCAAGTACA GCATCTTGCG GGGCGACTAC CAGGACGCTT TCACCATTGA GACAAACCCC 960 

GCCCACAACG AGGGCATCAT CAAGCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA 1020

TACAGCTTCA TCGTCGAGGC CACAGACCCC ACCATCGACC TCCGATACAT GAGCCCTCCC 1080 

GCGGGAAACA GAGCCCAGGT CATTATCAAC ATCACAGATG TGGACGAGCC CCCCATTTTC 1140 

CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAGCCTCT GATTGGCACA 1200

GTGCTGGCCA TGGACCCTGA TGCGGCTAGG CATAGCATTG GATACTCCAT CCGCAGGACC 1260

AGTGACAAGG GCCAGTTCTT CCGAGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA 1320 

CTGGACAGAG AAGTCTACCC CTGGTATAAC CTGACTGTGG AGGCCAAAGA ACTGGATTCC 1380

ACTGGAACCC CCACAGGAAA AGAATCCATT GTGCAAGTCC ACATTGAAGT TTTGGATGAG 1440 

AATGACAATG CCCCGGAGTT TGCCAAGCCC TACCAGCCCA AAGTGTGTGA GAACGCTGTC 1500 

CATGGCCAGC TGGTCCTGCA GATCTCCGCA ATAGACAAGG ACATAACACC ACGAAACGTG 1560

AAGTTCAAAT TCACCTTGAA TACTGAGAAC AACTTTACCC TCACGGATAA TCACGATAAC 1620 

ACGGCCAACA TCACAGTCAA GTATGGGCAG TTTGACCGGG AGCATACCAA GGTCCACTTC 1680 

CTACCCGTGG TCATCTCAGA CAATGGGATG CCAAGTCGCA CGGGCACCAG CACGCTGACC 1740 

GTGGCCGTGT GCAAGTGCAA CGAGCAGGGC GAGTTCACCT TCTGCGAGGA TATGGCCGCC 1800 

CAGGTGGGCG TGAGCATCCA GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA 1860 

GTGATCACCC TGCTCATCTT CCTGCGGCGG CGGCTCCGGA AG»GGCCCG CGCGCACGGC 1920 

AAGAGCGTGC CGGAGATCCA CGAGCAGCTG GTCACCTACG ACGAGGAGGG CGGCGGCGAG 1980 

ATGGACACCA CCAGCTACGA TGTGTCGGTG CTCAACTCGG TGCGCCGCGG CGGGGCCAAG 2040

CCCCCGCGGC CCGCGCTGGA CGCCCGGCCT TCCCTCTATG CGC^GGTGCA GAAGCCACCG 2100 

AGGCACGCGC CTGGGGCACA CGGAGGGCCC GGGGAGATGG CAGCCATGAT CGAGGTGAAG 2160

AAGGACGAGG CGGACCACGA CGGCGACGGC CCCCCCTACG ACACGCTGCA CATCTACGGC 2220 

TACGAGGGCT CCGAGTCCAT AGCCGAGTCC CTCAGCTCCC TGGGCACCGA CTCATCCGAC 2280 
TCTGACGTGG ATTACGACTT CCTTAACGAC TGGGGACCCA GGTTTAAGAT GCTGGCTGAG 2340

CTGTACGGCT CGGACCCCCG GGAGGAGCTG CTGTATTAGG CGGCCGAGGT CACTCTGGGC 2400

CTGGGGACCC AAACCCCCTG CAGCCCAGGC CAGTCAGACT CCAGGCACCA CAGCCTCCAA 2460

AAATGGCAGT GACTCCCCAG CCCAGCACCC CTTCCTCGTG GGTCCCAGAG ACCTCATCAG 2520 

CCTTGGGATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTCA GTGATGACTA 2580 

TTCTCAAATG CTGGCAAATC CAGGCTGGTG TTCTGTCTGG GCTCAGACAT CCACATAACC 2640 

CTGTCACCCA CAGACCGCCG TCTAACTCAA AGACTTCCTC TGGCTCCCCA AGGCTGCAAA 2700

GCAAAACAGA CTGTGTTTAA CTGCTGCAGG GTCTTTTTCT AGGGTCCCTG AACGCCCTGG 2760

TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTGGAGGCA AAGGCCTGGA CAGCTTGACT 2820 

TGTGGGGCAG GATTCTCTGC AGCCCATTCC CAAGGGAGAC TGACCATCAT GCCCTCTCTC 2880

GGGAGCCCTA GCCCTGCTCC AACTCCATAC TCCACTCCAA GTGCCCCACC ACTCCCCAAC 2940

CCCTCTCCAG GCCTGTCAAG AGGGAGGAAG GGGCCCCATG GCAGCTCCTG ACCTTGGGTC 3000 
CTGAAGTGAC CTCACTGGCC TGCCATGCCA GTAACTGTGC TGTACTG 
ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTGIGAA TTCATTCTGG A 

GAGATCAGGA GTGACAGATC ACAGGGTGAG GGCCACCTCC ACACCCACCC CCTCTGGAGA 3180 

AGGCCTGGAA GAGCTGAGAC CTTGCTTTGA GACTCCTCAG CACCCCTCCA GTTTTGCCTG 3240

AGAAGGGGCA GATGTTCCCG GAGATCAGAA GACGTCTCCC CTTCTCTGCC TCACCTGGTC 3300

GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTGGTT TAGAGGAACC 3360 

CAAGATGTGG CCTTTAGCAA AACTGACAAT GTCCAAACCC ACTCATGACT GCATGACGGA 3420 

GCCGAGCATG TGTCTTTACA CCTCGCTGTT GTCACATCTC AGGGAACTGA CCCTCAGGCA 3480

CACCTTGCAG AAGGAAGGCC CTGCCCTGCC CAACCTCTGT GGTCACCCAT GCATCATTCC 3540

ACTGGAACGT TTCACTGCAA ACACACCTTG GAGAAGTGGC ATCAGTCAAC AGAGAGGGGC 3600

AGGGAAGGAG ACACCAAGCT CACCCTTCGT CATGGACCGA GG7TCCCACT CTGGCAAAGC 3660

CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC 3720 

TTCTTATAAT GATTTTTTTA CTAATGATAC TTACAAGTTT CTAGCTCTCA CAGACATATA 3780

GAATAAGGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAAT7CAGGT 3840

G GAAAAACAAT TCCTGTAACC TTCTATTTTC TATAATTGTA GTAATTGCTC 3900

T GTCTATATAT TGGCCAAACT GGTGCATGAC AAGTACTGTA TTTTTTTATA 3960
CCTAAATAAA GAAAAATCTT TAGCCTGGGC AACAAAAAAA 

Seq ID NO: 339 Protein sequence
Protein Accession #: NP_001786



MQRLMMLLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDWI WUQMHIDEEK 
NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISEYHLTA 
VIVDKDTGEN LETPSSPTIK VHDVNDNWPV PTHRLPNASV PESSAVGTSV ISVTAVDADD
PTVGDHASVM YQILKGKEYP AIDNSGRIIT ITKSLDREKQ ARYEIW3AR DAQGLRGDSG 
TATVLVTLQD INDNFPFFTQ TKYTFWPED TRVGTSVGSL FVEDPDEPQN RMTKYSILRG 
DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPAGNRAQVI 
INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR 
VTKKGDIYNE KELDREVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHISVL DENDNAFEFA 
KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY
GQFDREIITKV IIFLPWISDN GMPSRTGTST LTVAVCKCNS QGEFTFCEDM AAQVGVSIQA 
WAILLCILT ITVITLLIFL RRRLRKOARA HGKSVPEIHS QLVTYDEEGG GEMDTTSYDV 
SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PPRHAPGAHG GPGEMAAMIE VKKDEADHDG 



Seq ID NO: 340 DNA sequence 
Coding sequence: 112-1593 

i i 1 T i 1 i 1 f 

GCGGAGGGTG CGTGCGGGCC GCGGCAGCCG AACAAAGGAG CAGGGGCGCC GCC 
CCCGCCACCC ACCTCCCGGG GCCGCGCAGC GGCCTCTCGT CTACTGCCAC CATGACCGCC 
AACGGCACAG CCGAGGCGGT GCAGATCCAG TTCGGCCTCA TCAACTGCGG CAACAAGTAC 
CTGACGGCCG AGGCGTTCGG GTTCAAGGTG AACGCGTCCG CCAGCAGCCT GAAGAAGAAG 
CAGATCTGGA CGCTGGAGCA GCCCCCTGAC GAGGCGGGCA GCGCGGCCGT GTGCCTGCGC 
AGCCACCTGG GCCGCTACCT GGCGGCGGAC AAGGACGGCA ACGTGACCTG CGAGCGCGAG 
GTGCCCGGTC CCGACTGCCG TTTCCTCATC GTGGCGCACG ACGACGGTCG CTGGTCGCTG 
CAGTCCGAGG CGCACCGGCG CTACTTCGGC GGCACCGAGG ACCGCCTGTC CTGCTTCGCG 
CAGACGGTGT CCCCCGCCGA GAAGTGGAGC GTGCACATCG CCATGCACCC TCAGGTCAAC 
ATCTACAGTG TCACCCGTAA GCGCTACGCG CACCTGAGCG CGCGGCCGGC CGACGAGATC 



720 



GCGCGCCCCG AGCCGGCCAC TGGCTACACG CTGGAGTTCC GCTCCGGCAA GGTGGCCTTC 

CGCGACTGCG AGGGCCGTTA CCTGGCGCCG TCGGGGCCCA GCGGCACGCT CAAGGCGGGC

AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGG AGCAGAGCTG CGCCCAGGTC 900

GTGCTGCAGG CGGCCAACGA GAGGAACGTG TCCACGCGCC AGGGTATGGA CCTGTCTGCC 960 

AATCAGGACG AGGAGACCGA CCAGGAGACC TTCCAGCTGG AGATCGACCG CGACACCAAA 1020 

AAGTGTGCCT TCCGTACCCA CACGGGCAAG TACTGGACGC TGACGGCCAC CGGGGGCGTG 1080 

CAGTCCACCG CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACATCGAGTG GCGTGACCGG 1140 

CGCATCACAC TGAGGGCGTC CAATGGCAAG TTTGTGACCT CCAAGAAGAA TGGGCAGCTG 1200 

GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCC TCATGAAGCT CATCAACCGC 1260

CCCATCATCG TGTTCCGCGG GGAGCATGGC TTCATCGGCT GCCGCAAGGT CACGGGCACC 1320 

CTGGACGCCA ACCGCTCCAG CTATGACGTC TTCCAGCTGG AGTTCAACGA TGGCGCCTAC 1380 

AACATCAAAG ACTCCACAGG CAAATACTGG ACGGTGGGCA GTGACTCCGC GGTCACCAGC 1440 

AGCGGCGACA CTCCTGTGGA CTTCTTCTTC GAGTTCTGCG ACTATAACAA GGTGGCCATC 1500 

AAGGTGGGCG GGCGCTACCT GAAGGGCGAC CACGCAGGCG TCCTGAAGGC CTCGGCGGAA 1560 

ACCGTGGACC CCGCCTCGCT CTGGGAGTAC TAGGGCCGGC CCGTCCTTCC CCGCCCCTGC 1620

CCACATGGCG GCTCCTGCCA ACCCTCCCTG CTAACCCCTT CTCCGCCAGG TGGGCTCCAG 1680 

GGCGGGAGGC AAGCCCCCTT GCCTTTCAAA CTGGAAACCC CAGAGAAAAC GGTGCCCCCA 1740 

CCTGTCGCCC CTATGGACTC CCCACTCTCC CCTCCGCCCG GGTTCCCTAC T 
3 CTGGGAGGGA



GAAGCGGCIA 
TTTGCCTCTC 
CTGTCAGTGG 



AGGGACGGTT 
CCAGCCACCT 
CCCTCCCTGG 
AGGACTGACC 
GGAGAGGCTC 



CCTCCCAGCC 
TGCACTGTCC 
CTTGTGGTGT 
AGCCTGGCTC 
GTTCTGCCAA 



TTTCAGATGC CCCTGCCCTC 
CCTCAGACG3 CTCTGAGCCT 
AGCCCTGGGC GTGTAGTGTA 
CCCCAGGAGA GCTGGGCACA 
CCGAAACCCC TGCTTGGGAA 
GGTGGCTGGA 

CCTTCCCTOG 



PCT/US02/12476 



CAAATCAGTA 
GTAGTAGCGA 
CCCCCTCTTT 
GCCAGAGCCC 
CGCCCCCTCC 
TCCCCAACAT 
TATAACTCTA 
AGTCTGC 



T CCTCCTGTCT CTTTCC 



GTGATCTGGC 
CCGTCCTTCC 
CTGCTGTGAT 
GGGAGCCCTG 
GCATCTCACT 
AACGCCCATG 




TGGTGCTCCC 



ACCCTAGCCT GACTGGAAGC A 
CGTCCCAGGC A 
TCAGCACCCT CCCCAGGGGG T 
CAGCCCTGGG CCTGGGCTGC C 
TGGGCCTCCC GGGTGGATGA A 
CCGGGGCCCC CCTGCTGCCA G 



ATAGCGAAA? A 



I 



I 



I 



I 



MTANGTAEAV QIQFGLINCG NKYLTAEAFG FKVNASASSL KKKQIWTLEQ P 
CLRSHLGRYL AADKDGNVTC EREVPGPDCR FLIVAHDDSR WSLQSSAHRR YFGGTEDRLS 
CFAQTVSEAE KWSVHIAMHP QVNIYSVTRK RYAHLSARPA DEIAVDRDVP WGVDSLITLA 
FQDQRYSVQT ADHRFLRHDG RLVARPEPAT GYTLEFRSGK VAFRDCEGRY LAP3GPSGTL 
KAGKATKVGK DELFALEQSC AQWLQAANE RKVSTRQGMD LSANQDEETD QETFQLEIDR 
DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYFDIEW RDRRITLRAS NGKFVTSKKN 
GQLAASVETA GDSELFLHKL INRPIIVFRG EHGFIGCRKV TGTLDANRSS YDVFQLEFND 
GAYNIKDSTG KYWTVGSDSA VTSSGDTPVD FFFEFCDYNK VAIKVGGRYL KGDHAGVLKA 
SAETVDPASL WEY 



CACCCCACTG 
GCTACCATGA 
ACCGTGCGTG 



ACATTTCCTG 
GCAGTTGGGG 
GAAGCCAGCT 
ACAGATCCAA 
CCCTGTGCCT 



TGCGCGGTAC 
ACCTGCGCGC 
GCCCCGGGGC 



CAGTTACTTG 



AGCGGCAGGC 
CGGGCGCCGA 
GCTCCAAAGA 
AGAACGAGCC 
ACACGCTGCA 
TCCAGGCCAC 
CAACGGCTAC 
CGCCTCTAAC 
TCCACCTGGG 
CATCGCGGAC 




TAGACGACTT 
GCCGCTCTTG 
CCAGGCGCCC 
TCGACGAGAA 
TTCCTGAGAT 
AAGCCGAGTC 
CGACTTCCTC 
TGAGCACAGC 
TCTGCTTTCA 
TGGAGAGTGA 
GGGTGAAAGT 
CCCCTCTTGG 



TCCTCGATGG 
AAAGGCCACT 
TGCCACTCCT 
AGTAGTAGTG 



GTGGGTGGAG 
CGGTGGGGTC 
CTGTGCAAGT 
TTGAGCTATC 
ACCGAGGTGA 
GAAATCGGCG 
TACCTCCGTG 
GCCTGCGAAT 
GGGGAAGGAC 
GCAACCAGCC 
ACACCACTTG 
GGATCACAGA 
ATCACCCC 



AGCCGACCCT 



GCACGATGTC 



TCCTGAGCCC 



CAGGCTTTCG 
TTGGTGATCT 
TCTTCCCAGC 
GCTGCTTTGG 



ACTCCTCCTC 
TGACCATGAC 
CAAGGAAGGA 
GCTCCAGTTC 
ACAGAGCAGA 



CTTCGAGCTG 
TGGGGGGACC 
GAGAACATGG 
AGACAATTCA 
TACCCTTCAA 
GATTTCCAAG 
TGCCGTGGTC 
AGTACTGGGG 
GTCTATGGGC 
TGCACATTGC 
GGGTGCCTTG 



AGCGCCGCTC 
CTCCCGATCT 
GGCGATGTGT 
CCTAACTGCC 
GGGAAGGACG 
GGGGTGCCCA 
CCAATCAGGG 



CTTGTCAAGC 
CCGCCGGGCC 
ACAAACAATG 



CTCTAGTGAT G 



? KAFATKAKID 



3 DRAPDTALRP 
3 LGGLLQPAPR 
RRGLQRPAVL GRTGAQAFPL 



DFSPPGTEVS 
DDLGGFACEC 
DEKLGETPLV 
TSSATPQAFD 
ESDPEPAALG 



ALCRGQLPIS 
PEQDNSVTSI 
SSSAHCTNNG 



HPGERAFAGF 
LRANGYLCKY 
VTCIADEIGA 



I I 
CTAKETIIRV NSQPTDWQKT 
GNCWPLGPRG DSWQLGGP3G 
SACGASSNEA GVRPVPPLAG 
LHPARWGAQH RACGRRAARC 
LLAVLRPRRS RKR:4AAVGGG 
QFEVLCPAPR PGAASNLSYR 



FAIYPSDKGV 
ARAEGKGGGT 
AIARAGRRRT 
ARAPAGRPRA 



PEIPRWGSQS 
STAVWLVIL 
VKVGDCDLRD 



APFQLHSAAL 
GKCAELPNCL 
VPQRTWPIRV 
GSVISKFNST 
RKESMGPPGL 
Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072 
Coding sequence: 149-2107 

I r t T i 1 i 

AAAGCCCTCA GCCITTGTGT CCTTCTCTGC GCCGGAGTGG CTGCAGCTCA CCCCTCAGCT 60

CCCCTTGGGG CCCAGCTGGG AGCCGAGATA GAAGCTCC?G TCGCCGCTGG GCTTCTCGCC 120 

TCCCGCAGAG GGCCACACAG AGACCGGGAT GGCCACCTCC ATGGGCCTGC TGCTGCTGCT 180 

GCTGCTGCTC CTGACCCAGC CCGGGGCGGG GACGGGAGCT GACACGGAGG CGGTGGTCTG 240 

CGTGGGGACC GCCTGCTACA CGGCCCACTC GGGCAAGCTG AGCGCTGCCG AGGCCCAGAA 300 

CCACTGCAAC CAGAACGGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 360 

CGTCCAGCGA GTACTGGCCC AGCTCCTGAG GCGGGAGGCA GCCCTGACGG CGAGGATGAG 420 

CAAGTTCTGG ATTGGGCTCC AGCGAGAGAA GGGCAAGTGC C?GC C TCTC CGCT 480 

GAAGGGCTTC AGCTGGGTGG GCGGGGGGGA GGACACGCCT TACTCTAACT GGCACAAGGA 540 

GCTCCGGAAC TCGTGCATCT CCAAGCGCTG TGTGTCTCTG C7GCTGGACC TGTCCCAGCC 600 

GCTCCTTCCC AACCGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGGCTCCCC 660 

CGGAAGTAAC ATTGAGGGCT TCGTGTGCAA GTTCAGCTTC AAAGGCATGT GCCGGCCTCT 720 

GGCCCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTCCAGACCA CCAGTTCCTC 780 

CTTGGAGGCT GTGCCCTTTG CCTCTGCGGC CAATGTAGCC TGTGGGGAAG GTGACAAGGA 840 

CGAGACTCAG AGTCATTATT TCCTGTGCAA GGAGAAGGCC CCCGATGTGT TCGACTGGGG 900 

CAGCTCGGGC CCCCTCTGTG TCAGCCCCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 960 

CCACCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 1020 

CCGGCTGCTG GATGACCTGG TGACCTGTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCATG 1080 

TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACGT GCCGCTGCCC 1140 

CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GACGTGGATG AATGCCAGGA 1200 

CTCCCCCTGT GCCCAGGAGT GTGTCAACAC CCCTGGGGGC TTCCGCTGCG AATGCTGGGT 1260 

TGGCTATGAG CCGGGCGGTC CTGGAGAGGG GGCCTGTCAG GATGTGGATG AGTGTGCTCT 1320 

GGGTCGCTCG CCTTGCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 1380 

TGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACG TGGATGAGTG 1440 

TGTGGGCCCG GGGGGCCCCC TCTGCGACAG CTTGTGCTTC AACACACAAG GGTCCTTCCA 1500 

CTGTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CCATGGGGCC 1560 

TGTGTCTCTG GGACCACCAT CTGGGCCCCC CGATGAGGAG GACAAAGGAG AGAAAGAAGG 1620 

GAGCACCGTG CCCCGCGCTG CAACAGCCAG TCCCACAAGG GGCCCCGAGG GCACCCCCAA 1680 

GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCCCCCATCA CATCTGCCCC 1740 

ACTCAAGATG CTGGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA 1800 

CGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA 1860 

AAACAACGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCOTAG GCACCGTGGT 1920 

GGCCATCCTA CTCCTGCTGG CCCTGGCTCT GGGGCTACTG GTCTATCGCA AGCGGAGAGC 1980 

GAAGAGGGAG GAGAAGAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT 2040 

TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCCGACAC CTGGGACAGA 2100 

CTCCTGAAAG TGAGGTGGCC CTAGAGACAC TAGAGTCACC AGCCACCATC CTCAGAGCTT 2160 

TGAACTCCCC ATTCCAAAGG GGCACCCACA TTTTTTTGAA AGACTGGACT GGAATCTTAG 2220 

CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG 2280

TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGCGTGC CACGGTGGGG ATTTCGTGAC 2340

TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTCCA ATGTGACCAA TTCCGGATCA 2400 

GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC 2460

ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGT -TCTCTT 2520

CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG 2580 

TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT T7GCATTCCT CCATTTCGCC 2640

AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 2700 

TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC 2760 

CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA 2820 

CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT 2880

CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA 2940

TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG 3000 

CCTCAGACAC CCTGCCTGTG GCCCCGCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC 3060 

CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 3120

TGAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT 3180 

TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA 3240 

CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300 

TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 3360 

TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT 3420

TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 3480 

TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG 3540 

CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAGAGTCACT AGAAGTTACC 3600 

TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCACCACT GTGCAGGAGA 3660 

ACAGACAGAG GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGAGAAA GGGTGT7TCT 3720 

GGTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 3780 

CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCIATGG TTCAGATTGT 3840

TTTTAATAGA AAACTAAAGG GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC 3900 

TCGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG 3960 

AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CCATTTGGCA AAACTTCCTT 4020 

GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGCCAT 4080 

CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGA? GCGCTGCTGA CCAACATCAG 4140 

CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA 4200 

TCATTTGGGG TGAAGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT 4260 

GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320 

GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG 4380 

GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440 

ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTGGT GCTTTCTCIT GCACACCACT 4500 

CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGGTTGTGC ATCTGATGGA 4560 

AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC 4620 

ATGGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACCG 4680 

GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGGCGTGG 4740 

CACAGTGGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTGCTAACT 4800
TCTCGCTAGA CACAGTGTTT CTGCCCAGGT GACCTGTTCA GCAGCAGAAC AAGCCAGGGC 4860

CATGGGGACG GGGGAAGTTT TCACTTGGAG ATGGACACCA AGACAATGAA GATTTGTTGT 4920

CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC 4980

TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040

CACCCAGCTC GCCATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT 5100 

CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG 5160 

CAGAAAAACC AGGGCAGGAC AGTTATCGAC AATGCATTAG AACTTGGTGA GCATCCTCTG 5220 

TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGG CAAGACCAAC CACATCTGGT 5280 

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340 

AACACATCTA CGTGTAGCAC TACGACGTTA TGTTTGGGTA AIGTGGGGAT GAACTGCATG 5400 

AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTCGGCCTT GCAAGGCCAC 5460 

CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA 5520 

TGCCATCTTC CCTGCGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580 

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640 

TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CCCIGAGCAA 5700 

TCTTTCGATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT 5760 

ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820 

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATT7CT CTTCTTAGCT 5880 

TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 

TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000

TATGATCCCA GAAAACATCT GTCTCTACTT CGGCTGCAAA ACCCATGGTT TAAATCTATA 6060 

TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCGAATTCTC CATATATTCA 6120

CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG 6180 

TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA T7TTAGGTAG 6240

AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCAGTTCTT 6300

TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA T7TGGCTTAA 6360 

GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA 6420 

GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA 6480 

TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC 6540 

ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTTCTCCTC TGAGTTCTAA 6600 

CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGITGTC CTTTGAGCTT 6660 
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 

Seq ID NO: 345 Protein sequence



1 11 21 31 41 51 

I I I I I I 

MATSMGLLLL LLLLLTQPGA GTGADTEAW CVGTACYTAII SGKLSAAEAQ NHCNQNGGNL 60 

ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE KGKCLDPSLP LKGFSWVGGG 120 

EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKW SEGPCGSPGS PGSMEGFVC 180 

KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC 240 

KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC 300 

ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAQECVH 360

TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420 

DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF HCGCLPGWVL APNGVSCTMG PVSLGPPSGP 480 

PDEEDKGEKE GSTVPRAATA SPTRGPEGTP KATPTTSRPS LSSDAFITSA PLKMLAPSGS 540 

SGVWREPSIH HATAASGPQE PAGGDSSVAT QMNDGTDGQK LLLFYILGTV VAILLLLALA 600

LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT DC 

Seq ID NO: 346 DNA sequence
Nucleic Acid Accession #: Z31560
Coding sequence: <1-966



AGCCCGGACC GCGTCAAGCG GCCCATGAAT GCCTTCATGG TGTGGTCCCG CGGGCAGCGG 
CGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTCGG AGATCAGCAA GCGCCTGGGC 
GCCGAGTGGA AACTTTTGTC GGAGACGGAG AAGCGGCCGT TCATCGACGA GGCTAAGCGG 
CTGCGAGCGC TGCACATGAA GGAGCACCCG GATTATAAAT ACCGGCCCCG GCGGAAAACC 
AAGACGCTCA TGAAGAAGGA TAAGTACACG CTGCCCGGCG GGCTGCTGGC CCCCGGCGGC 
AATAGCATGG CGAGCGGGGT CGGGGTGGGC GCCGGCCTGG GCGCGGGCGT GAACCAGCGC 
ATGGACAGTT ACGCGCACAT GAACGGCTGG AGCAACGGCA GCTACAGCAT GATGCAGGAC 
CAGCTGGGCT ACCCGCAGCA CCCGGGCCTC AATGCGCACG GCGCAGCGCA GATGCAGCCC 
ATGCACCGCT ACGACGTGAG CGCCCTGCAG TACAACTCCA TGACCAGCTC GCAGACCTAC 
ATGAACGGCT CGCCCACCTA CAGCATGTCC TACTCGCAGC AGGGCACCCC TGGCATGGCT 
CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCCAGCT CCAGCCCCCC TGTGGTTACC 
TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCCGGGACAT GATCAGCATG 
TATCTCCCCG GCGCCGAGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 
CACTACCAGA GCGGCCCGGT GCCCGGCACG GCCATTAACG GCACACTGCC CCTCTCACAC 
ATGTGAGGGC CGGACAGCGA ACTGGAGGGG GGAGAAATTT TCAAAGAAAA ACGAGGGAAA 
T GCAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACCCGG TACGCTCAAA 



HSARMYNMME TELKPPGPQQ TSGGGGGNST AAAAGGNQKN SPDRVKRPMN AFMVWSRGQR
RKMAQENPKM HNSEISKRLG AEWKLLSETE KRPFIDEAKR LRALHMKEHP DYKYRPRRKT
KTLMKKDKYT LPGGLLAPGG NSMASGVGVG AGLGAGVNQR MDSYAHMNGW SNGSYSMMQD
QLGYPQHPGL NAHGAAQMQP MHRYDVSALQ YNSMTSSQTY MNGSPTYSMS YSQQGTPGMA



CCCTTGTAAA 



GCCAGGCCAA 



TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT C 

CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 
AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT A 
CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCC 
TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 

CGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 
TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 



Seq ID NO: 350 DNA sequence
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468 

31 






I 



I 



CGCCTCTTGC 
CGAGCCTGAG GCCGAGACTG 
TCTCGCTCTC 



GAATTCCGGA CAGGACGTGA AGATAGTTGG GTTTGGAGGC G 
GTGGACCTGC CGCCATGCAG GACGGTAACT TCCTGCTGTC GGCCCTGCAG CCTGAGGCCG 120 
GCGTGTGCTC CCTGGCGCTG CCCTCTGACC TGCAGCTGGA C 
CGGAGGCCGA GCGGCTGCGG GCAGCCCGCG T 
AGCTGGGACA GCAGCCGCGG CACAACGGGG CCGCTGAGCC C 
CCAGAGGCAC ATCCAGGGGG CAGTACCACA CCCTGCAGGC T 
AGGGCCTGAG TGGGGACAAG ACCTCGGGCT T 
CAGCCTCCTG G 

CCATGCCCAC CAGGCCCGTG 
ATGACACACT CTCCCTGCGC TCGCTGCGGC 
GCCTGGTGTC TGAGCAGCTG GAGCCCGCGG 
AGCGCCAGGC CAGCTCCAGC TCCAGCCGGG 
AGGTTTCCCC GAGCCGGACC ATCCGTGCCC 



CCACCTCCAC 



CTCGAGCGCC 
TTGATGACAT 



AGCCTCAGCC 
TACGGTAGCC 
TCAGCAGTCA 



AAGTGCAGCG 
AGCTGGCCCT 
ATGATGAGCT 
TGAAGGACCG 



ACCGCCTCTA CGACG 



CGGCTTCCTC 
CCACGGGCTG 
CGAGGACAAG 



GTGCCTAGGC 
GGTGCCATGC 
AACGGGATCT 
GTCACAGGGA 
GACACGCTGG 
CCCCTCATCC 
AGGAACCTCA 
GTGGACGCCC 



CTGCCGTGCG 
CAGTGCCGGG 
TCAGCCTGGC 
ACCGAACCCT 
AGTACCTCAT 
AGTGCTACAG 



GACCCTGCAG CGATTCCAGA 
GGCCGTCCTG G 
TGACTCGGGC C 
GCAGAGACTC A 
GGCTTCAGAC CCCAACCTGC 
CGATGCAGCC GCCAAGAAGC 



GCAACCTCAT 



TCCTGTGGAA 
AGCAGCTCAC 
AGCAGAACGC 
GCTCAGCCTC 
TGGTCACCTC 
ACGCGGTGTG 



GGACCTGGTG T 



CGTCCTGCGG AACCTGTCCT 



GGGACCTGGC GGGGGCGCCG CCGGGAGAGG T 
GCTGCCCCTC GCCGCCGATG C 
CGAGTGGCTG TGGAGCCCCC A 
GCTCAACCGG CACACGACGG 



TTCTGAACCC 
CTGGCCTCAT 
TGGTGAGCCA 



CCTGCTAGAC 
CCGAAACCTG 
CCTGATCGAG 



I CGCGGAGGTG TCCAAGGACC 
3 GCTGTACAAC 
CGGGGCGCTG 
GGCCCTGGAG 



GCCCATAGGT 
TCAGCTCCAG 
TCGCTGGGGC 
ATAGCTGGGG 
TGTATGGGGT 
ATCTTGGGAT 
AAAAGGAATT C 



CAGTGAGAAG 
GCTCCACCGT 
GAAGCCTTCT 

CCCTGTGTGC 
ACTTCGCTTC 
GGTGACCCAG 



TCTCGGAACG 
AAGCTGCCAG 
GCTGTGCTCA 
GACGGACTCC 
TCCTCCCGGG 
GACTTTCGGG 
GGAGGAGAAG 
AGCCCAGCCT 

CGCAGGGCAG 
TCACATTGGC 
GGGAATAAAG 



CAGAACATCA 
CAGGAGCGTA 
CGCTCACTGA 



TGAGCCGCCT 
CCGCCGACCA 
CTAGGAACAA 
GCAGCGTGGG TGAGAAGTCG CCCCCAGCCG 
ACAACCTGGT 
GAAAGCTCAT 
CAGCATCCAG 



GTGACGTGGC 
GGAGGAGAAG 
GTCCTGGGCC 
GGGGTGGGGC 
AGAGGTGGGG 
ATGGCCATGA 



CTTCATCAAG 
CCTCCTGGCC 
TCGGAAGGAG 



AAGAAGCGGG : 

AACCTGTGGC : 
GACTTCCTGG 

AGGGACAGAC : 

GAGGGGCCCC I 

GCAGGGTCTT I 

GCTGCTCTGG : 
I I I I I I

MQDGNFLLSA LQPEAGVCSL ALPSDLQLDR EGAEGPEAEE LRAARVQEQV RARLLQLGQQ 60 

PRHNGAAEPE PEAETARGTS RGQYHTLQAG FSSRSQGLSG DKTSGFRPIA KPAYSPASWS 120 

SRSAVDLSCS RRLSSAHNGG SAPGAAGYGG AQPTPPMPTR PVSFHERGGV GSRADYDTLS 180
LRSLRLGPGG LDDRYSLVSE QLEPAATSTY RAFAYERQAS SSSSRAGGLD WPEATEVSPS 240

RTIRAPAVRT LQRFQSSHRS RGVGGAVPGA VLEPVARAPS VRSLSLSLAD SGHLPDVHGF 300 

NSYGSHRTLQ RLSSGFDDID LPSAVKYLMA SDPNLQVLGA AYIQHKCYSD AAAKKQARSL 360 

QAVPRLVKLF NHANQEVQRH ATGAMRNLIY DNADNKLALV 3ENGIFELLR TLREQDDELR 420 

KNVTGILWHL SSSDHLKDRL ARDTLEQLTD LVLSPLSGAG G 
FLRNLSSASQ ATRQKMRECH GLVDALVTSI NHALDAGKCE D

EMPPSALQRL EGRGRRDLAG APPGEWGCF TPQSRRLREL PLAADALTFA EVSKDPKGLE 

WLWSPQIVGL YNRLLQRCEL NRHTTEAAAG ALQNITAGDR RWAGVLSRLA LEQERILNPL 

LDRVRTADHH QLRSLTGLIR NLSRNARNKD EMSTKWSHL IEKLPGSVGE KSPPAEVLVN 

IIAVLNNLW ASPIAARDLL YFDGLRKLIF IKKKRDSPDS EKSSRAASSL LAMLWQYNKL 
HRDFRAKGYR KEDFLGP

Seq ID NO: 352 DNA sequence 
Nucleic Acid Accession #: M31469
Coding sequence: 1-651

i i 1 T T i 1 i 1 

ATGGCTGCGC AGGGAGAGCC CCAGGTCCAG TTCAAACTTG TATTGGTTGG TGATGGTGGT 
ACTGGAAAAA CGACCTTCGT GAAACGTCAT TTGACTGGTG AATTTGAGAA GAAGTATGTA

GCCACCTTGG GTGTTGAGGT TCATCCCCTA GTGTTCCACA CCAACAGAGG ACCTATTAAG 

TTCAATGTAT GGGACACAGC CGGCCAGGAG AAATTCGGTG GACTGAGAGA TGGCTATTAT

ATCCAAGCCC AGTGTGCCAT CATAATGTTT GATGTAACAT CGAGAGTTAC TTACAAGAAT 

GTGCCTAACT GGCATAGAGA TCTGGTACGA GTGTGTGAAA ACATCCCCAT TGTGTTGTGT 
GGCAACAAAG TGGATATTAA GGACAGGAAA GTGAAGGCGA AATCCATTGT CTTCCACCGA

AAGAAGAATC TTCAGTACTA CGACATTTCT GCCAAAAGTA ACTACAACTT TGAAAAGCCC 

TTCCTCTGGC TTGCTAGGAA GCTCATTGGA GACCCTAACT TGGAATTTGT TGCCATGCCT 

GCTCTCGCCC CACCAGAAGT TGTCATGGAC CCAGCTTTGG CAGCACAGTA TGAGCACGAC 
TTAGAGGTTG CTCAGACAAC TGCTCTCCCG GATGAGGATG ATGACCTGTG A

Seq ID NO: 353 Protein sequence



MAAQGEPQVQ FKLVLVGDGG TGKTTFVKRH LTGEFEKKYV ATLGVEVHPL VFHTNRGPIK 60

FNVWDTAGQE KFGGLRDGYY IQAQCAIIMF DVTSRVTYKN VPNWHRDLVR VCENIPIVLC 120

GNKVDIKDRK VKAKSIVFHR KKNLQYYDIS AKSNYNFEKP FLWLARKLIG DPNLEFVAMP 180
ALAPPEVVMD PALAAQYEHD LEVAQTTALP DEDDDL


Seq ID NO: 354 DNA sequence 
Nucleic Acid Accession #: NM_002820 
Coding sequence: 304-831

1 11 21 31 41 51 

I I I I I I 

CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60 

CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120

CGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 180

TTCAGAGGAA GCGCCTCTGA TTTGTTTCTT TTTTCCCTTT TTGCTCTTTC TGGCTGTGTG 240 

GTTTGGAGAA AGCACAGTTG GAGTAGCCGG TTGCTAAATA AGTCCCGAGC GCGAGCGGAG 300 

ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCG 360

GTGCCCTCCT GCGGGCGCTC GGTGGAGGGT CTCAGCCGCC GCCTCAAAAG AGCTGTGTCT 420 

GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTACGGCG ACGATTCTTC 480

CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540

CCTAACTCCA AGCCCTCTCC CAACACAAAG AACCACCCCG TCCGATTTGG GTCTGATGAT 600

GAGGGCAGAT ACCTAACTCA GGAAACTAAC AAGGTGGAGA CGTACAAAGA GCAGCCGCTC 660 

AAGACACCTG GGAAGAAAAA GAAAGGCAAG CCCGGGAAAC GCAAGGAGCA GGAAAAGAAA 720

AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGGAGTGG GCTAGAAGGG 780

GACCACCTGT CTGACACCTC CACAACGTCG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840 

CTGGCCCGTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCC TTCTGCCTTG 900 

GCTTGGACAA ACCTAGAATT TTCTCCCTTT ATGTATCTCT ATCGATTGTG TAGCAATTGA 960

CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCAGTAC CCCCCTACCA CACACACCCC 1020 

TGTCCTCCAG CACCATAGAG AGGCGCTAGA GCCCATTCCT CTTTCTCCAC CGTCACCCAA 1080

CATCAATCCT TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAGAA GCTAGTGACC 1140 

ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCCCTTA CTCTCACACC TGGGCAAACT 1200

TTCTTCAGTG TTTTTCATTT CTTACGTTCT TTCACTTCAA GGGAGAATAT AGAAGCATTT 1260

GATATTATCT ACAAACACTG CAGAACAGCA TCATGTCATA AACGATTCTG AGCCATTCAC 1320 

ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380

TAAATTATGT TTTAAACACA TGCCTTAAAT TTGTTTAATT AAATTTAACT CTGGTTTCTA 1440

CCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAAGGATATA 1500

GGTTTTTCTC ATGTATCTTT TTGTTCATTG GCAAGATGAA ATAATTTTTC TAGGGTAATG 1560 
CCGTAGGAAA AATAAAACTT CACATTTAAA AAAAA 



S GVTGSGLEGD HLSDTSTTSL ELDSR






ATGGGCCTCC CCGAGCCGGG CCCTCTCCGG CTTCTGGCGC TGCTGCTGCT GCTGCTGCTG 
CTGCTGCTGC TGCGGCTCCA GCATCTTGCG GCGGCAGCGG C 

CCAAGGAGTG CGAAAAGGAC CAATTCCAGT GCCGGAACGA GCGCTGCATC 
CGAGGACGAT GACTGCTTAG ACCACAGCGA CGAGGACGAC 
AGACCTGTGC AGACAGTGAC TTCACCTGTG ACAACGGCCA CTGCATCCAC 
AGTGTGACGG CGAGGAGGAG TGTCCTGATG GCTCCGATGA GTCCGAGGCC
TCCTGCAGAG AAGCTGAGCT G 



TGCCCCAAGA 
GAACGGTGGA 
ACTTGCACCA 
TGTGTACCTG 
GCCGGCTGTG 
ACATGTGTCC 
GAAGCTGGCT 
ATCTGCACTG 
GACCAGAAGA 



CTGACCAAGA 
ACGAGTGCGG 
AATGTCGTGG 
TACCGTAAGA 



ATCTACTGGA 
CGACGCACTC 
CGAGGGTTCA 



CTACCTCACT 
TTGCAATCAA 
GCCTACAGGG 
ACCTCAAGAT 
CTTGTGGCGA 
ACAAGGGCTA 
ACTGCAAGGC 
AGGATCGACC 
CACTAGATGT 
TCTATAGCGC 
AGCAGTTGCA 
CTGACTCGGG 
TCTTCAGCCG 



GGGCACCTGC 
GCACTGCAAC 
GCTGAACGAG 



IGCTTTGAA TGCACGTGCC C 
CATTGATGAG 

TGCTGCTGGC 



r CCAGCTCCTG 



AACGGTGTGG ACCGGCAAAC 
CTGGATCTGC 
ATTGACTTCA 
CCTTTTGGGA 
ATTTTCAGTG 
AACCCACATG 
GAGCTGAGTG
TCCAGCCACT CTCCCAAGTA 
ATGAAGAGGT 



GTGGAGGCAA 

CAAATCGGCT 
ACATTGTCAT 



GGAAGTTGCC 
CTACATGGAC 
CTCTCCAGAG 
CAATAAGACC 
TAACCTCAGT 
TGACTGGGGG 
ACTGGTGTCA 
CTTGTACTGG 



GAGTGCTACC 
AAGAGCCCAT 
AACTATTCAC 



AAGGCCAGTG 



CCCTAATCTT 
GCCTCATCCC 
TCTACTGGTG 
ACCCGAAAGA 
TGGACTGGGT 



GACCAGGCCA 



TGAGGACAAG 
CAATGGCCTG 
CTTCCATGAG 
TGGAGGCTGT 
CACATGTGCC 



GTAGACTCCA 
CTGATCTCCT 
GTGTTCTGGA 
GAAATCTCCA 



CCATCGCTGT 
AGATTGAGAA 
AATGGCCCAA 
AGCTACACCA 
CCACTGACTT 
CAGACCTGGA 



GATGGACCTA 
CACCAACCGC 
CATGCTCAAG 
TGACCTCTCC 
GCGGGAGGTC 
CCACAAGCAC 
\ TGGTGGCCGC 



GAATACCTGT G 



TGACCCCCTG 
ATCTGGGCTC 
CGGAATCACC 
ACTGTCCAGC 
CCTGAGCCAC 
GAACGAGGCC 
GAACCTCAAC 
AGATGCCTGT 
TCCTCAGA 



CTGATCTGGA 



GGATCATCGT G 



G GTGATAGCCC 



TGAATTTTGA 



G AACTGCTCAG 



3 TCTATCCTGC A 



GATTTTTTTT 



GGATGAATGG 
TTTAAATTTA 
GATGTGAGAG 
TGAGGAATTC 



I GGGGGGCTTT 



TGTTGCGGAA 
TTTTTCTATG 
GTGGAATGGC 
TACCTTACTC 
TTTAGGTTTT
AACATAAGT 



AGGTAACCAC 
TATAATGTTT 
TACTGCTGAC 



GGGCATTTGT 



CATGCACTAC 
GTGAGTGTAT 
AAAGTTATGA 
TATACACTTT 
TAACATGATG 
AACTATATTT 
TTTTTGTAAA 



ACTCCGGATG 
GTGTGTGTGT 
TGAACTGCAA 
TTAACTGGTT
CACATAACCA 
ACAGAAGATG 
TAAGATGATT 



2340 
2400 



MGLPEPGPLR LLALLLLLLL LLLLRLQHLA AAAADPLLGG QGPAKECEKD QFQCRNERCI 60

PSVWRCDEDD DCLDHSDEDD CPKKTCADSD FTCDNGHCIH ERWKCDGEEE CPDGSDESEA 120 

TCTKQVCPAE KLSCGPTSHK CVPASWRCDG EKDCEGGADE AGCATSLGTC RGDEFQCGDG 180 

TCVLAIKHCN QEQDCPDGSD EAGCLQGLNE CLHNWGGCSH ICTDLKIGFE CTCPAGFQLL 240 

DQKTCGDIDE CKDPDACSQI CVNYKGYFKC ECYPGCEMDL LTKNCKAAAG KSPSLIFTNR 300 

NWALDVEVA TNRIYWCDLS YRKIYSAYMD KASDPKEREV 360

1YWTDSGNKT ISVATVDGGR RRTLFSRNLS EPRAIAVDPL* 420 

NGVDRQTLVS DNIEWPNGIT LDLLSQRLYW VDSKLHQLSS 480 

PFGIAVFEDK VFWTDLENEA IFSANRLNGL EISILAENUJ 540 

ELSVQPNGGC EYLCLPAPQI SSHSPKYTCA CPDTMWLGPD 600 

LIWRNKKRXN TKSKNFDKPV 660 



DQAKIEKSGL 
LISSTDFLSH 
LKQPRAPDAC 



YRKTTEEEDE DELHIGRTAQ IGHVYPARVA LSLEDDGLP 



Coding sequence: 



I 



I 



~ AATTTCAAAT CTGATCTATT CGGCTTAGCG ACTGAAGATT 
GACGCTGCCC GATCGCCTCG GAAGTCCCCT GGACCATCAC AGAAGCCGAG CTTCGGGTAA 
CTCTCACAGT GGAGGGTAAG TCCATCCCCT GTTTAATCGA TACGGGGGCT ACCCACTCCA 
CGTTGCCTTC TTTTCAAGGG CCTGTTTCCC TTGCCCCCAT AACTGTTGTG GGTATTGACG 
GCCAAGCTTC AAAACCCCTG AAAACTCCCC CACTCTGGTG O 
TTATGCACTC TTTTTTAGTT ATCCCCACCT GCCCACTTCC C 
TAACCAAATT ATCTGCTTCC CTGACTATTC CTGGAGTACA GCTACATCTC ATTGCTGCCC 
TTCTTCCCAA TCCAAAGCCT CCTTTGTGTC CTCTAACATC CCCACAATAT CAGCCCTTAC 
CACAAGACCT CCCTTCAGCT TAATCTCTCC CACTCTAGGT TCCCACGCCG CCCCTAATCC 
CACTTGAAGC AGCCCTGAGA AACATCGCCC ATTCTCTCTC CATACCACCC OCCAAAAATT 
TTCGCCGCTC CAACACTTCA ACACTATTTT GTTTTATTTG TCTTATTAAT ATCAGAAGGC 
AGGAATGTCA GGCCTCTGAG CCCAGGCCAG GCCATCGCAT CCCCTGTGAC TTGCACGTAT
ACATCCAGAT GGCCTGAAGT AACTGAAGAT CCACAAAAGA AGTAAAAACA GCCTTAACTG
ATGACATTCC ACCATTGTGA TTTGTTCCTG CCCCACCCTA ACTGATCAAT GTACTTTGTA 
ATCTCCCCCA CCCTTAAGAA GGTTCTTTGT AATTCTCCCC ACCCTTGAGA ATGTACTTTG 
TGAGATCCAC CCCTGCCCAC CAGAGAACAA CCCCCTTTGA TTGTAATTTT TTATTACCTT 






S DLFGLATEDW RCPIASEVPW TITEAELRVT LTVEGKSIPC LIDTGATHST 
LPSFQGPVSL APITWGIDG QASKPLKTPP LWCQLGQHSF MHSFLVIPTC PLPLLGRNIL 
TKLSASLTIP GVQLHLIAAL LPNPKPPLCP LTSPQYQPLP QDLPSA 






I 



TTCGTGGGTT 
GGTGGAAAAC 
TCTTCCAAGC 
TTCACAATTC 
CTAAAGGCTC 
AACAGTTATT 



ACATCCCTGG 
GAGTTCACAG 
GAAACGGTGG 
TAGAGAGGTC 
TCCAGAGGGA 
AGATACTGCT 
TCCAGGTGGA 
AATTCAGTCT 



CGATACCTTG 



GAAAGGACGA 
GGGGCTCGGA 

TCACCGTAAC AACCCTCGCA 
CTCCAGTTGA TGTACTAAAA 
TTGCACAAAC 
ACAACTCAGT 
AATACTATTT 
TTCCTTTTAT CTATATATAA TGAGCATGGT 



CCCTCTCCCT CCCCAATGGC 



CCCCAGAAGA CTATCCCCTC T 
TAGCAATCAG CC 
CGAAACCACT TC 



T TTCTGTTTGA AGACCACACT 



C TGACGGGAAG T 



TTGACCTTCC 240 

GCACTAGATT 300 

AGAAAGAATT 360 

GCCCCAACAA 420 

ACAGTAAAAC 480 

ATTCAGCAAA 540 

GGAAAACCTG 600 



GTGATCCCAA 
CACCCAAGGC 
TCGAATATGA 
GACCCACTGT 
AAGAATACAA 
GGACAAATGA 



GCAGGGATTC 



CCAGAGGAAA 
TGATCTTCTG 
AGATAAACCA 
TATTACAGAA 
ACCAGCAGTG 



GAAGTTTTTG 
GACTACTGTG 
CAGGAACCTC 
GGGGAAGCAG 
ACAATAGCAC 
ATGGAAAGTT 
GTTGAAGAAA 
AATTCTGAGG 
GTAGATGGAG 
ACAAGCCCCC 



AGGGGGACAT 
AGCATTATAG 
AGATAGATGA 
AGTATAAAGA 



TGATTGTAAG 
CAATGGAATC 
TCAGCAGTTT 
TCCAGACTGT 



AAGAAAACCA 
ACGGTTTTTG 
TTGATCACAG 
GACTCTTCAG 



ACCAGACAGA 
TATTTACTGA
ATACACTATA 



CTAATGAAGA 



GGCTGAAAGT GTAACAGAGG 
AAACATCGTT GATGATTTTC 

AGAATATCTA AOGGGAGAGG 
TGAAAACAAA GAAATAGAC3 
ATATGATTTT TATGAATATA 



TGCATATGGA GAGAAAGGAC 




CAACCATCTC 

TGAGAGGCCC ACCTGGCCCA 
GTTCATCTGG GGCCAAAGGT 
AGGGTCCCCC TGGTCCAACG 
GAAGAGGAAT 



AAGCTGGCCC 



CTGGGCCTCC 
CAATTGGTCC 
GTGCTGATGG 
TGGGTCCCCC 
CAGAIGGTGT 
GATTCAAAGG 



TGGAATGAGG 
ACGAGGTTTG 
TGTAGATGGC 
AGGTCAACAA 



CACAGGGGTG 
GGAGAAGATG 
CTGGGTCCAA 
CCCCCAGGAC 
GGGAATCCAG 



TGACATGGGT 



CTTCAGGTCA 



AAGGTGCACG 
CTCGAGGTTC 
GTGGCGATGG 
TTGGATTCCC 
ACCCTGGGCA 
GAGTGGTTGG 



AGCAGGAGAA 
TCCAAAGGGT 
GGGAGTAGCT 
AAGAGGTGCA 
CCCTCCTGGC 
TGGACCAAAA 
ACGTGGGGAG 
ACCACAGGGA 



GAGAAATTGG ACCAAGAGGT 
AGGAGCTCCA 
CATGGGTCCC 



CATCCTGGGA 
GGTCCTATTG 
AAGGGATCTA AAGGTGAAAA GGGTGAAGAT 
CTAAAAGGTG 
GGACCCAAAG 
AAGGGAAAAC 



GCAGATGGAG 
GGACTTCCGG 
CCAGGTCCTC 



AAGGGGGCTC 



2340 
2400



GTCGAGCAGG CCCAACTGGA 
TTGGAGTTCC AGGATTACCA 
TCCCTGGGTT TCCAGGTC-CC 



GGCAAACCAG 
AGAGGTCCCA CTGGGAAACC TGGGCCf 
CCTCCAGGTG 
GGCCCTCCTG 
ACTGGATTTC 
CCAACCGGTG 



GACCCAGGTC 
GGATATCCAG 
AATGGAGAGA 



C TGCCCAGGAC 



GCCCTCCCGG 



TCCAGGTCCT 

GGGCCCACCA 

TGCTCCTGGA 
TGGTCTCCCA 
AATTGGTGAG 



AAAGAGGTCC TCAAGGACC" 
GACCACCAGG AAGGATGGGC 
AAGGCAAGAC CGGCCCTCCT 

AGACTGGTCC AATAGGGGAA CGTGGGTATC 312 
GTCTTCCTGG TGCTGCAGGA 



T CCCTGGACCT 



CTTCCTGGAG CTCAGGGTGC ACCTGGACTG AAAGGAGGGG 
GGTCCAGTTG GCTCACCAGG AGAACGTGGG 
CGAGGGCGCC CGGGACCTCA GGGTCCTCCT 
GAAAAAGGTC CCCAAGGGCC TGCAGGGAGA GATGGAGTTC 
GGGCCAGCTG GTCCTGCCGG CTCCCCTGGG 
CCGGGACAAA AAGGCAGCAA GGGTGGCAAG 
CTTCAAGGAC CAGTTGGTGC CCCTGGAATT 
GGACAGCAGG GGATGTTTGG GCAAAAAGGT 
ITGGTCCAA TAGGTCTTCA 



2820 
2880
2940 
3000 



3 TGAAAATGGG GATGTTGGTC CATGGGGGCC ACCTGGTCCT CCAGGCCCAA 



3240 
3300 
3360 



3660 
3840 
GAGGCCCTCA AGGTCCCAAT
CAGTTGGTGG TGTTGGAGAA 
GGGAAGCAGG TGTAGGCGGT 
CTGGAGCTGC TGGACCTCCA 



GGAGCTGATG GACCACAAGG 
AAGGGTGAAC CTGGAGAAGC 
CCCAAAGGAG 



ACCCCCAGGT TCTGTTGGTT 
AGGAAACCCA GGGCCTCCTG
GCTGGTCCAC 






GGCCGCCAGG TGATGATGGC CCTAAGGGTA 



GCTCCAAGGG 



GTCCTCAAGG 



CAATCTTGTC 
AACAAGACAT 



TGTTGGTGGT 
TGGTGAGGCT 
AGAGGGAAGA 
AAAAACCGGC 
GGGCATCCCT 
ACCTGGTCCT 
TGAAAAGGGA 
AGGTGACCGA 
TCCTGGTCCT 
CCCAAAGGGT 
AGGGCCTCCT 



GTCCTCCTGG 
CTCAGGGACC 



ACCAAGGTTG 
CTTGCATTTA 
AACCAGGAAG 
AAGGAAATTC 
GGCAAAATTT 



ATCCTTTTAT 
TCATTGAAAT 
ACTTTGGTGA 
AAGATTAAGA 
TTTTGTGCCA 
TACCATTTAG 



TGATTACTCG 
CGAGCATATG 
GCAACTCAGC 
CTCAGGAGAT 
TCCAGACAAA 
TTGGTTTAGT 
CATCAATATG 
CACCTACCAC 
AGCACTTCGC 
CAAAACACTG 
CAATACACCA 
TCAGAATCAG 
CAAAGAACAT 
CATGCAAGTT 
GAAATACCGA 



CCTGGAGATC CTGGTCCTCC 
GACAAGGGTG 
GGCCCACCAG 
CAAGGTGAAA 
CCAGTCGGTC 
GGTCCTGTGG 
ATGGGACCTC CTGGCTTACC 
CATCCTGGTT TAATTGGCCT 
GGGCTCCCTG GAACTCAAGG 
GCTGGTCCCT TAGGTCCACC 
CTACTGGACC 
GTCCACCTGG 
ATACTGAAGG 



TCCTGGTCAA CCGGGTCCTC 
AAAACGAGGT CCTCCTGGAG 
GGGGGAAGCA GGTGCAGAAG 



4140
4260 



TCTCCCTGGA 
TGGTCTCAAA 
GATTGGTCCT 



GGGCCTCCAG 



GATGGAATGG 



CATCCTGACT 
TCCTTCAAAG 
AAATCTGAGG 
GAATTTAAGA 
GTGCAAATGA 



TGGGTACTCA 



CGCTGGCCAG 
TGAAGTCATT 
CATGCAAGCA 
TGGTTCCCTC
GACCAATCCA 



TTTACTGTAA 
GAGTAAGAAT 
GGGGAAAACT 



TTCCTGGGAT 



AAAATTGATC 
AAGTTCGGAT 
ATCAAATCAA 
TTGAATAAGG 
TGCCTTTGTG 



TTCTCAACTC TCCTTTTCCT A 

AATATATATT CATAAAAAAT A 

TGTGTTTAAT AAATTGTAAT T 

CCAAAACTTG CACGTGTCCC T 

GATGGCAATA ATATATGTAT T 



CAGCAGCCTG 
CAAATGATGA 
GTACGTCCAG 
AAGTACCTAT 
TTGAAGTTGG 
CAGAAAATGT 
ATGTATGGAA 
GGGGCAGAAT 
ATGGAAACAG 
CTTTGGTGCT 
TTCTCATCCA 
ACAGTTCTAT 
TGACTCTAAT 
AAGTTATGAT 
GTGTGTTT 



TTTCACATCT 
TTCATCATGG
GCTTTCATAC 
ACTTCTGACT 
GTATGATGTG 
GGAGATGTCC 
AAAAGGCTAT 
TGTTGATGTC 
TCCTGTTTGT 
ACCTTGGTGC 
AACAACGCTG 
CACAGACAAA 
GGCTGATTCT 
GTAGAAAACA 
TCCAGGATGT 
ACTGTTATCT 
TTATGAGGAT 
TTCCGATGAC 



G CAAAAGGGG 
GGCTTACCAG 
AAAGGTGACA 
CAGCCTTTAC 
GATGCAGATG 
AATTCCCTGA 
GCCCGAACTT 
ATTGATCCTA 
GGTGGTGAGA 
CCAAAGGAGA 
TTAGATGTTG 
GCCTCTGCTC 



TATGACAATA 
GAAAAAACTG 
ATGATCAGTG 
TTTCTTGGCT 
CACCAACCCA 
CATATACAGG 
AGCTTTGAAA 
TGATTCCCAA 
AAAAAAGAAA 
ACTAAAACAG 
GTGTCCATTT 
GCCGAACTCT 
CCTAAGTCCC 



5460 
5520 
5580 
5640
5700 

5820 
5880 
5940 



MEPWSSRWKT 
CTNRKNSKGS 
EHGIQQIGVE 



GFDGLPGLPG 
GAPGQPGMAG 
GLAGLPGADG 
GEDGFPGFKG 
GLPGYPGRQG 
GPKGTSGGDG 
GPPGPGGWG 
GPAGLRGPPG 
GPPGPAGEKG 
GGKGENGPPG 
GLPGPPGEKG 
GNPGPPGEAG 
GELGPAGQDG 
GEAGAEGPPG 



TLALTFLFQA 

DTAYRVSKQA QLSAPTKQLF 

VGRSPVFLFE DHTGKPAPED YPLFRTVNIA 

DRSEEAIVDT NGITVFGTRI LDEEVFEGDI 

AQAQEPQIDE YAPEDIIEYD YEYGEAEYKE 

YGTMESYQTE APRHVSGTNE PNPVEEIFTE 

DLLVDGDLGE YDFYEYKEYE DKPTSPPNEE 

PAWEPGMLV EGPPGPAGPA GIMGPPGLQG 

TMLMLPFRYG GDGSKGPTIS AQEAQAQAIL 

AKGESGDPGP QGPRGVQGPP GPTGKPGKRG 

DKGHRGERGP QGPPGPPGDD GMRGEDGEIG 



PPGHPGKEGQ SGEKGALGPP GPQGPIGXPG 



VLKALDFHNS PEGISKTTGF 
IQSFLLSIYN 
V3KK7VTMIV 



QQARIALRGP 
RPGADGGRGM 
PRGLPGEAGP 
LPGPQGPIGP 
PRGVKGADGV 



PGPMGLTGRP 
PGEPGAKGDR 
RGLLGPRGTP 
PGEKGPQGKP 
RGLKGSKGEK 



PGANGEKGAR GVAGKPGPRG 
QGPQGPVGFP GPKGPPGPPG 
IGERGYPGPP GPPGEQGLPG 

ERGSAGTAGP I GLRGRPGPQ 

SPGEDGDKGE IGEPGQKGSK 

QKGDEGARGF PGPPGPIGLQ 

PGPPGPRGPQ GPNGADGPQG PPGSVGSVGG VGEKGEPGEA 

KGEAGPPGAA GPPGAKGPPG DDGPKGNPGP VGFPGDPGPP 

_ r rbD PGQPGPPGPS GEAGPFGPPG KRGPPGAAGA 3GRQGEKGAK 

GI PGP VGEQG LPGAAGQDGP PGPMGPPGbP 



ERGLPGAQGA 
APGEKGPQGP 
PPGLQGPVGA 
ENGDVGPWGP 



KTGPVGPQGP 



GPPGLPGPQG PKGNKGSTGP 
MQADADDNIL DYSDGMEEIF 
EYWIDPNQGC SGDSFKVYCN 
LSYLDVEGNS INMVQMTFLK 
EMSYDNNPFI KTLYDGCTSR 
PVCFLG 



IGPPGEQGEK GDRGLPGTQG SPGAKGDGGI PGPAGPLGPP 
AGQKGDSGLP GPPGPPGPPG 
GSLNSLKQDI EHMKFPMGTQ 
FTSGGETCIY PDKKSEGVRI 
LLTASARQNF TYHCHQSAAW 
KGYEKTVIEI NTPKIDQVPI 



1020 
1080 

1200 
1260 

1380 

1500 
1560 
1620



TTCCCCAGCA TTCGAGAAAC TCCTCTCTAC TTTAC-CACGG TCTCCAfflCT CAGCCGAGAG 
ACAGCAAACT GCAGCGCGGT GAGAGAGCGA GAGAGAGGGA GAGAGAGACT CTCCAGCCTG 
GGAACTATAA CTCCTCTGCG AGAGGCGGAG AACTCCTTCC CCAAATCTTT TGGGGACTTT 
TCTCTCTTTA CCCACCTCCG CCCCTGCGAG GAGTTGAGGG GCCAGTTCGG CCGCCGCGCG 240

CGTCTTCCCG TTCGGCGTGT GCTTGGCCCG GGGAACCGGG AGGGCCCGGC GATCGCGCGG 300 

CGGCCGCCGC GAGGGTGTGA GCGCGCGTGG GCGCCCGCCG AGCCGAGGCC ATGGTGCAGC 360 

AAACCAACAA TGCCGAGAAC ACGGAAGCGC TGCTGGCCGG CGAGAGCTCG GACTCGGGCG 420 

CCGGCCTCGA GCTGGGAATC GCCTCCTCCC CCACGCCCGG CTCCACCGCC TCCACGGGCG 480 

GCAAGGCCGA CGACCCGAGC TGGTGCAAGA CCCCGAGTGG GCACATCAAG CGACCCATGA 540

ACGCCTTCAT GGTGTGGTCG CAGATCGAGC GGCGCAAGAT CATGGAGCAG TCGCCCGACA 600

TGCACAACGC CGAGATCTCC AAGCGGCTGG GCAAACGCTG GAAGCTGCTC AAAGACAGCG 660

ACAAGATCCC TTTCATTCGA GAGGCGGAGC GGCTGCGCCT CAAGCACATG GCTGACTACC 720

CCGACTACAA GTACCGGCCC AGGAAGAAGG TGAAGTCCGG CAACGCCAAC TCCAGCTCCT 780 

CGGCCGCCGC CTCCTCCAAG CCGGGGGAGA AGGGAGACAA GGTCGGTGGC AGTGGCGGGG 840 

GCGGCCATGG GGGCGGCGGC GGCGGCGGGA GCAGCAACGC GGGGGGAGGA GGCGGCGGTG 900 

CGAGTGGCGG CGGCGCCAAC TCCAAACCGG CGCAGAAAAA GAGCTGCGGC TCCAAAGTGG 960 

CGGGCGGCGC GGGCGGTGGG GTTAGCAAAC CGCACGCCAA GCTCATCCTG GCAGGCGGCG 1020

GCGGCGGCGG GAAAGCAGCG GCTGCCGCCG CCGCCTCCTT CGCCGCCGAA CAGGCGGGGG 1080 

CCGCCGCCCT GCTGCCCCTG GGCGCCGCCG CCGACCACCA CTCGCTGTAC AAGGCGCGGA 1140 

CTCCCAGCGC CTCGGCCTCC GCCTCCTCGG CAGCCTCGGC CTCCGCAGCG CTCGCGGCCC 1200 

CGGGCAAGCA CCTGGCGGAG AAGAAGGTGA AGCGCGTCTA CCTGTTCGGC GGCCTGGGCA 1260 

CGTCGTCGTC GCCCGTGGGC GGCGTGGGCG CGGGAGCCGA CCCCAGCGAC CCCCTGGGCC 1320 

TGTACGAGGA GGAGGGCGCG GGCTGCTCGC CCGACGCGCC CAGCCTGAGC GGCCGCAGCA 1380 

GCGCCGCCTC GTCCCCCGCC GCCGGCCGCT CGCCCGCCGA CCACCGCGGC TACGCCAGCC 1440 

TGCGCGCCGC CTCGCCCGCC CCGTCCAGCG CGCCCTCGCA CGCGTCCTCC TCGGCCTCGT 1500 

CCCACTCCTC CTCTTCCTCC TCCTCGGGCT CCTCGTCCTC CGACGACGAG TTCGAAGACG 1560 

ACCTGCTCGA CCTGAACCCC AGCTCAAACT TTGAGAGCAT GTCCCTGGGC AGCTTCAGTT 1620

CGTCGTCGGC GCTCGACCGG GACCTGGATT TTAACTTCGA GCCCGGCTCC GGCTCGCACT 1680 

TCGAGTTCCC GGACTACTGC ACGCCCGAGG TGAGCGAGAT GATCTCGGGA GACTGGCTCG 1740 

AGTCCAGCAT CTCCAACCTG GTTTTCACCT ACTGAAGGGC GCGCAGGCAG GGAGAAGGGC 1800 

CGGGGGGGGT AGGAGAGGAG AAAAAAAAAG TGAAAAAAAG AAACGAAAAG GACAGACGAA 1860 

GAGTTTAAAG AGAAAAGGGA AAAAAGAAAG AAAAAGTAAG CAGGGCTCG- TCGCCCGCGT 1920

TCTCGTCGTC GGATCAAGGA GCGCGGCGGC GTTTTGGACC CGCGCTCCCA TCCCCCACCT 1980 

TCCCGGGCCG GGGACCCACT CTGCCCAGCC GGAGGGACGC GGAGGAGGAA GAGGGTAGAC 2040 

AGGGGCGACC TGTGATTGTT GTTATTGATG TTGTTGTTGA TGGCAAAAAA AAAAAGCGAC 2100 

TTCGAGTTTG CTCCCCTTTG CTTGAAGAGA CCCCCTCCCC CTTCCAACGA GCTTCCGGAC 2160 

TTGTCTGCAC CCCCAGCAAG AAGGCGAGTT AGTTTTCTAG AGACTTGAAG GAGTCTCCCC 2220 

CTTCCTGCAT CACCACCTTG GTTTTGTTTT ATTTTGCTTC TTGGTCAAGA AAGGAGGGGA 2280

GAACCCAGCG CACCCCTCCC CCCCTTTTTT TAAACGCGTG ATGAAGACAG AAGGCTCCGG 2340 

GGTGACGAAT TTGGCCGATG GCAGATGTTT TGGGGGAACG CCGGGACTGA GAGACTCCAC 2400 

GCAGGCGAAT TCCCGTTTGG GGCCTTTTTT TCCTCCCTCT TTTCCCCTTG CCCCCTCTGC 2460 

AGCCGGAGGA GGAGATGTTG AGGGGAGGAG GCCAGCCAGT GTGACCGGCG CTAGGAAATG 2520 

ACCCGAGAAC CCCGTTGGAA GCGCAGCAGC GGGAGCTAGG GGCGGGGGCG GAGGAGGACA 2580 

CGAACTGGAA GGGGGTTCAC GGTCAAACTG AAATGGATTT GCACGTTGGG GAGCTGGCGG 2640 

CGGCGGCTGC TGGGCCTCCG CCTTCTTTTC TACGTGAAAT CAGTGAGGTG AGACTTCCCA 2700

GACCCCGGAG GCGTGGAGGA GAGGAGACTG TTTGATGTGG TACAGGGGCA GTCAGTGGAG 2760 
GGCGAGTGGT TTCGGAAAAA AAAAAAGAAA AAAAGGG 

n sequence 



I I I ' I I I 

MVQQTNNAEN TEALLAGESS DSGAGLELGI ASSPTPGSTA STGGKADDPS WCKTPSGHIK 
RPMNAFMVWS QIERRKIMEQ SPDMHNAEIS KRLGKRWKLL KDSDKIF " 
ADYPDYKYRP RKKVKSGNAN SSSSAAASSK PGEKGDKVGG SGGGGHGGGG G 
GGGASGGGAN SKPAQKKSCG SKVAGGAGGG VSKPHAKLIL AGGGGGG 

QAGAAALLPL GAAADHHSLY KARTPSASAS ASSAASASAA LAAPGKHLAE KKVKRVYLFG 300

GLGTSSSPVG GVGAGADPSD PLGLYEEEGA GCSPDAPSLS GRSSAASSPA AGRSPADHRG 360 

YASLRAASPA PSSAPSHASS SASSHSSSSS SSGSSSSDDE FEDDLLDLNP SSNFESMSLG 420 
SFSSSSALDR DLDFNFEPGS GSHFEFPDYC TPEVSEMISG DWLESSISNL VFTY 

Seq ID NO: 364 DNA sequence 
Nucleic Acid Accession #: U10860
Coding sequence: 123-2204 

1 11 21 31 41 51 

I I I I I 

TGCCGGCTGC TCCTCGACCA GGCCTCCTTC TCAACCTCAG CCCGCGGCGC CGACCCTTCC 60 

GGCACCCTCC CGCCCCGTCT CGTACTGTCG CCGTCACCGC CGCGGCTCCG GCCCTGGCCC 120

CGATGGCTCT GTGCAACGGA GACTCCAAGC TGGAGAATGC TGGAGGAGAC CTTAAGGATG 180

GCCACCACCA CTATGAAGGA GCTGTTGTCA TTCTGGATGC TGGTGCTCAG TACGGGAAAG 240

TCATAGACCG AAGAGTGAGG GAACTGTTCG TGCAGTCTGA AATTTTCCCC TTGGAAACAC 300

CAGCATTTGC TATAAAGGAA CAAGGATTCC GTGCTATTAT CATCTCTGGA GGACCTAATT 360

CTGTGTATGC TGAAGATGCT CCCTGGTTTG ATCCAGCAAT ATTCACTATT GGCAAGCCTG 420

TTCTTGGAAT TTGCTATGGT ATGCAGATGA TGAATAAGGT ATTTGGAGGT ACTGTGCACA 480

AAAAAAGTGT CAGAGAAGAT GGAGTTTTCA ACATTAGTGT GGATAATACA TGTTCATTAT 540

TCAGGGGCCT TCAGAAGGAA GAAGTTGTTT TGCTTACACA TGGAGATAGT GTAGACAAAG 600

TAGCTGATGG ATTCAAGGTT GTGGCACGTT CTGGAAACAT AGTAGCAGGC ATAGCAAATG 660 

AATCTAAAAA GTTATATGGA GCACAGTTCC ACCCTGAAGT TGGCCTTACA GAAAATGGAA 720

AAGTAATACT GAAGAATTTC CTTTATGATA TAGCTGGATG CAGTGGAACC TTCACCGTGC 780 

AGAACAGAGA ACTTGAGTGT ATTCGAGAGA TCAAAGAGAG AGTAGGCACG TCAAAAGTTT 840 

TGGTTTTACT CAGTGGTGGA GTAGACTCAA CAGTTTGTAC AGCTTTGCTA AATCGTGCTT 900

TGAACCAAGA ACAAGTCATT GCTGTGCACA TTGATAATGG CTTTATGAGA AAACGAGAAA 960

GCCAGTCTGT TGAAGAGGCC CTCAAAAAGC TTGGAATTCA GGTCAAAGTG ATAAATGCTG 1020

CTCATTCTTT CTACAATGGA ACAACAACCC TACCAATATC AGATGAAGAT AGAACCCCAC 1080 

GGAAAAGAAT TAGCAAAACG TTAAATATGA CCACAAGTCC TGAAGAGAAA AGAAAAATCA 1140 

C TTTTGTTAAG ATTGCCAATG AAGTAATTGG AGAAATGAAC TTGAAACCAG 1200 

CCTTGCCCAA GGTACTTTAC GGCCTGATCT AATTGAAAGT GCATCCCTTG 1260 

TTGCAAGTGG CAAAGCTGAA 
A GGAGGGAAAA 



CTCATCAAAA 



AAATTACCAG TCTGCATTCA 
AGGGTGACTG TCGTTCCTAC 
GGGAATCACT TATTTTTCTG 
TTGTTTATAT ATTTGGCCCA 



GGACTTCCAG 
GTAATATGTG 
AAAATAGTAG 
AAAGCCTGCA 



AGTTACGTGT 



CCCATCACAA TGACACAGAG CTCATCAGAA 
CTCTGAAAGA TTTTCATAAA GATGAAGTGA 
AAGAGTTAGT TTCCAGGCAT CCATTTCCAG 
CTGAAGAACC TTATATTTGT AAGGACTTTC 
CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 
CAACAGAAGA GGATCAGGAG AAGCTGATGC
AATTAAAACT GTAGGTGTGC 
CAGTAAAGAT GAACCTGACT 
TACCTCGCAT GTGTCACAAC GTTAACAGAG 
AGATGTTACT CCCACTTTCT 



3 GTATGCTGGG 
TTGATCGGGA CCCACTTCAA 
TTATTACTAG TGACTTCATG 
AGGTGGTATT AAAGATGGTC 
Z ATCAAAGCCC 



AAGCAGCCTT 
ACTGGTATAC 
ACTGAGATTA 
CCAGGAACTA 



AGATGCCGGT GATTTTGACA CCATTACATT
CATGCCAGAG ATCTGTGGTT ATTCGAACCT 
CTGCAACACC TGGCAATGAG ATCCCTGTAG 
AGAAGATTCC TGGTATTTCT CGAATTATGT 
CTGAGTGGGA GTAATAAACT TC



2040 
2100 
2160






I 

MALCNGDSKL 
AFAIKEQGFR
KSVREDGVFN 
SKKLYGAQFH 
VLLSGGVDST 
HSFYNGTTTL 
EVFLAQGTLR 
ILGRELGLPE 
TLLQRVKACT 
ESLIFLARLI 



I 



AIIISGGPNS 
ISVDNTCSLF 
PEVGLTENGK 
VCTALLNRAL 
PISDEDRTPR 
PDLIESASLV 
ELVSRHPFPG 
TEEDQEKLMQ 
PRMCHNVNRV 



WLKMVTEIK KIPGISRIMY 



VILKNFLYDI 
NQEQVIAVHI 
KRISKTLNMT 
ASGKAELIKT 
PGLAIRVICA 
ITSLHSLNAF 
VYIFGPPVKE 
DRDPLQKQPS
DLTSKPPGTT 



PAIFTIGKPV 
LTHGDSVDKV 
AGCSGTFTVQ 



I I 
IDRRVRELFV QSEIFPLETP 
LGICYGMQMM NKVFGGTVHK 
ADGFKWARS GNIVAGIANE 
NRELECIREI KERVGTSKVL
QSVEEALKKL GIQVKVINAA 



EEPYICKDFP 
LLPIKTVGVQ
PPTDVTPTFL 
CQRSWIRTF 



LR3EGKVIEP LKDFHKDEVR 
ETNNILKIVA DFSASVKKPH 
GDCRSYSYVC GISSKDEPDW 
TTGVLSTLRQ ADPEAHNILR 
ITSDFMTGIP ATPGNEIPVE 



GCGGCCTCAG 
TATGTTGATA AGGAAAATGG 
CTGGGGTCTG GACCTTCAAT 
A CGTTCGATGC 



CTGTTAAGAC 
AGAACCAGGC 
CAAAGCCTTA 
CCCACCAGCC 
AAflGTCTGTA 
GATGACTGAG 



CTGCAATAAT 

1, 71 
GATGGGAGAT 
TTACCTAAAG 



: TACTCTGATC 



CCTGTGAAGA 1 



TTGACGAGGA 



AGAGCACCAG 
GAGAGAGCTT 
ACCATGGGAA 



CTACTAGAAA 
GACCCCTCAA 
AAGCAAAAAG 
CCTTCAATCC 
TCCCCTTGAG 



AACACCACGT 
GGCTTTGGGA 
ACAAAAACAG 
CTCTGTTCCT 
TCTAGACTTT 
TGGAGTGCCT 



TCCAATCTGT 
GTTTGCTGTG 
TAATAAAGCA 



TGCAGTGTCC TTCAAGCATT 
ACATAGATAT TTAAATTTCT 
TTCTTCAACA GAAAAAAAAA 



Protein Accession # 



MATLIYVDKE NGEPGTRWA KDGLKLGSGP SIKALDGRSQ VSTPRFGKTF DAPPALPKAT 

RKALGTVNRA TEKSVKTKGP LKQKQPSFSA KKMTEKTVKA KSSVPA3DDA YPEIEKFFPF 

NPLDFESFDL PEEHQIAHLP LSGVPLMILD EERELEKLFQ LGPPSPVKMP SPPWESNLLQ 
SPSSILSTLD VELPPVCCDI DI 



ATTCGGGGCG 
CCTGCCCGCC 
CTGCCGAGAG 
CCGCTGCTGC 
CTGTTCCGCT 



CGCCCGCTCG 



GTCCGGGAGC C 
GGCGTCTACA C 
CTGCCCCTGC A 



A AGTCGGGTAT 
C CTGCCAGGAC 



I I 
AAGAAGCGGA GGAGGCGGCT 
CTCGCTCGCC CGCCGCGCCG 
CGCGCTGCCG CTGCCGCCGC 
GGGCGCGAGT GGCGGCGGCG 
CACACCCGAG CGCCTGGCCG 
CGCAGTGGCC GGAGGCGCCC 
CTGCTGCTCG GTGTGCGCCC 
CGGCCAGGGG CTGCGCTGCT 
CATGGGCGAG GGCACTTGTG
GGTTGCAGAC 
CACCATGAAC 
GAAGGAGCTG GCCGTGTTCC 
TGGCAAGCAT CACCTTGGCC 
TCCCTGCCAA CAGGAACTGG 



GCGGGGCGCG CGCGGAGGTG 
CCCGCCGGTT 
CGCGGAGCTC 
GGCTGGAGGG CGAGGCGTGC 
ATCCCCACCC GGGCTCCGAG 
AGAAGCGCCG GGACGCCGAG 
ACCACTCAGA AGGAGGCCTG 
GGGGAGGCAG TGCTGGCCGG 
GGGAGAAGGT CACTGAGCAG 
TGGAGGAGCC CAAGAAGCTG 
ACCAGGTCCT GGAGCGGATC 
CAGCGTGGGG AGTGCTGGTG
ACCATCCGGG GGGACCCCGA 
GTGCACACCC AGCGGATGCA 
GCCCCTCTCC AAACACCGGC 



CCTGTACAAC 
TGTGAACCCC 
GTGTCATCTC 
GTAGACCGCA 
AGAAAACGGA 



CCTCTGGAGC 
CTCAAACAGT 
AACACCGGGA 
TTCTACAATG 



CCCGGCCTCT CTCTTCCCAG 
GAGGAAGGGG GTTGTGGTCG 
TTTATTTTTG AACCCCTGTG 



CTGCAGATGC 
GGGAGCTGGG 
TCCCTTTTGC 



3 GAAAGAGACC 



ACCTCTACTC CCTGCACATC 
GCAAGATGTC TCTGAACGGG
AGCTGATCCA GGGAGCCCCC 
AGCAGCAGGA GGCTTGCGGG 
GCCTGGCGCC CCTGCCCCCC
GTGGTGGGTG CTGGAGGATT
AGCACCGAGC TCGGCACCTC 
CCTTCTTGCT TTCCCCGGGG 



CACACCTGCT 
GTACAGGTTT 
ATAAGATTAA AGGAAGGAAA AGT 






I 



I 



MLPRVGCPAL PLPPPPLLPL LPLLLLLLGA SGGGGGARAE VLFRCPPCTF ERLAACGPPP 
VAPPAAVAAV AGGARMPCAE LVREPGCGCC SVCARLEGEA CGVYTPRCGQ GLRCYPHPGS 
ELPLQALVMG EGTCEKRRDA EYGASPEQVA DNGDDHSEGG LVENHVDSTM NMLGGGGSAG 
RKPLKSGMKE LAVFREKVTE QHRQMGKGGK HHLGLEEPKK LRPPPARTPC QQELDQVLER
ISTMRLPDER GPLEHLYSLH IPNCDKHGLY NLKQCKMSLN GQRGECWCVN PNTGKLIQGA 
PTIRGDPECH LPYNEQQEAC GVHTQRMQ 

Seq ID NO: 370 DNA sequence
Nucleic Acid Accession #: NM_004264
Coding sequence: 6-440 



GGAACATGGC G 






AGACAGCAAT 
CAGCACTGAT 
AAGAATCTAC 
AAGCTCCTAC 
AAAGCGCACT 
AGTCTCTTCC 
GTGCCATTAA 
TTAAACACTA 
GATAAGCTTA 
GAGTGAAATT 
AATTCTGTTA 



TGCACGAACA 



ATGTGTGGAG 
TGCTGATATT 
AGACTCATAG 
GAATTCTGCA 
TGACACATTA 
TAAATCATGA 
ATTAAGGCAT 
TGACATAATT 



CAGCCAGCTA 
GCAAAAGACA 
CAGGCTGCTA 



GCACAGTCAC 
CATCAGTGGA 
TCAGACTTAG 
CCTTTTTAGC 
TTGAATCAGC 
GTAATACATT 
TATGTCTCCA 



GTGGTCCTCC 
ACCCTACAGA 
TTGATGTTTT 
GCTTGTATAA 
ATCGAGGAGA 
AGCTGAAGAC 
TACCATGTGG 
ATACAAGCCT 
TATTTTTAAT 
TTTAAAGCAT 



GAATTCGCTT GCAGATCAGT 
TGCCTCTTTC AATAATATTC 
AGAGTATGCC CAGCTTTTTG 
GATAGATTCC TTACCCAGTG 
GCTAGAAGAA G 



TTTTGTTGTA 



ACCCATAGCC 
CTGAGAAAAG AACTGTTTGA 
TACCAACAAT TACAGAAACA 
AGTCTTCTAT TTTCACTCTT 
CATACCATCA TTTTTTAACT 
ATATAAGGAA ACATATGTAA 
TTGGCCAGTA CTTTTACAAT 



Seq ID NO: 371 Protein sequence



MADRLTQLQD AVNSLADQFC NAIGVLQQCG PPASFNNIQT AINKEQPANE TEEYAQLFAA
LIARTAKDID VLIDSLPSEE STAALQAASL YKLEEENHEA ATCVEDWYR GDMLLEKIQS 
ALADIAQSQL KTRSGTHSQS LPDS 



CTGCGCGTGG 
CATTTCAAAG 
TTCTTAGACC 
ACAGTACAGA 
CTGTTTTTGG 
AGAGCTAAGG 
ACTCTTACAA 
TTCTCCTGGA 
TATGACACAT
GAAACTATCA 
CTTCTTGGAA 
AAAGCTGTGG 
TTCTACATGC 
CTGTGGATTC 
ATTCCAATAT 
AAAGTTAGAT 
ATAAATTTTC 
CATGCCTGTG 

Seq ID NO: 



AGGTGTTGAC 

CTCAAGGACA 
TTGTGAAACC AGAGCCTGTT 



r TGATCGTTGG 



ACTTAAGGAA 
TCTTTGTCAA 
TCCATACTGT 



GAAATTTTAT 



AGGATACCTG 
CCTGACTGTG
GGCTGACATG 
TGGAGTCACT 
TTTGTTTATC 
GTTTTATTTG 
TGACATGGAT 



I I I 

TACTGGGCTC AGCGACACCG CGAGCTATAT 
CCTGCCATCA GCATCACTGA AAACGTGCTG
GGAGACAATG TCTATGAATT TCACCTGGAG 
TACAAACTGA CCCAGAGGCA GGTAAACATT 
GAGAGACTCA CAAAGCT 
CTGGATGAAT CTGATGCGGA A 
CTCCGACTGG AAAGCGAAGG CTCTCCTGAA 



CGGACGATTC 
TCTTCAGATT 
TAAACAGCGC 
TTTGGGAGGC 



ATCTTTGGCA 
TGGAGTGCAA 
TGGAAGGTGC 
TTGGCGGAAG 
AGTTTCACAT 
TATCTTATAA 
AGACTGAAAA 
TGA 



TCTTGGGAAA 
GCCAGATGCT 
TGCTGCCTTC 



TTGAAATTTT 
TCACATGGCT 
CTGTCTCAGT 
TGCCATATGC 
TGATATTTTT
TGAGGGCAGG 



TCTGATCCAG 
AATGCAGAAC 
CAGGTACTCT 
TCGTTACACT 
GATTCAGTCC 
AGTGAAAATC 
AGGTTTATAC : 



MENQVLTPHV YWAQRHRELY LRVELSDVQN PAISITENVL HFKAQGHGAK GDNVYEFHLE
FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP LFLAPDFDRW LDESDAEMEL
RAKEEERLNK LRLESEGSPE TLTNLRKGYL FMYNLVQFLG FSWIFVNLTV RFCILGKESF












WKVLTWLRYT LWIPLYPLGC LAEAVSVIQS 300
INFRHLYKQR RLKMRAGAVA 360



I I I 

C AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGCGACACCG CGAGCTATAT

CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG
CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG
TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT
ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA
CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC
AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CTCTCCTGAA
ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA
TTCTCCTGGA TCTTTGTCAA CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAGTCCTTT
TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG
GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG
CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC
AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT
TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT
CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC
ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC
AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC
ATAAATTTTC GTCACCTTTA TAAACAGCGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA 

Seq ID NO: 375 Protein sequence

1 11 21 31 41 51 

MENQVLTPHV YWAQRHRELY LRVELSDVQN PAISITENVL HFKAQGHGAK GDNVYEFHLE

FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP LFLAPDFDRW LDESDAEMEL

RAKEEERLNK LRLESEGSPE TLTNLRKGYL FMYNLVQFLG FSWIFVNLTV RFCILGKESF

YDTFHTVADM MYFCQMLAVV ETINAAIGVT TSPVLPSLIQ LLGRNFILFI IFGTMEEMQN

KAVVFFVFYL WSAIEIFRYS FYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LAEAVSVIQS

IPIFNETGRF SFTLPYPVKI KVRFSFFLQI YLIMIFLGLY INFRHLYKQR RRRYGKKRKR
STKKKDLDGF LPV 

Seq ID NO: 376 DNA sequence



1 I I I I 

ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA G 
GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC CAAGGAGCCC 
TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG CCAGCCCAAG 
ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT CACTCCAGCA 
CCAGCCCAGC AGAAGACCAA GCAGAAGTAA 



Seq ID NO: 378 DNA sequence 
Nucleic Acid Accession 
Coding sequence: 74-505 

1 11 21 31 41 51 

| | I I I 

A CACTGCGGCG GGCGTCTGTT CTAGTGTTTG AGCCGTCGTG CTTCACCGGT 

T AGCATGTCGG GCCGCGGCAA GACTGGCGGC AAGGCCCGCG CCAAGGCCAA 

GTCGCGCTCG TCGCGCGCCG GCCTCCAGTT CCCAGTGGGC CGTGTACACC GGCTGCTGCG 
GAAGGGCCAC TACGCCGAGC GCGTTGGCGC CGGCGCGCCA GTGTACCTGG CGGCAGTGCT 
GGAGTACCTC ACCGCTGAGA TCCTGGAGCT GGCGGGCAAT GCGGCCCGCG ACAACAAGAA 
GACGCGAATC ATCCCCCGCC ACCTGCAGCT GGCCA7CCGC AACGACGAGG AGCTCAACAA 
GCTGCTGGGC GGCGTGACGA TCGCCCAGGG AGGCGTCCTG CCCAACATCC AGGCCGTGCT 
GCTGCCCAAG AAGACCAGCG CCACCGTGGG GCCGAAGGCG CCCTC GCG 3 AGA? 
CACCCAGGCC TCCCAGGAGT ACTAAGAGGG CCCGCGCCGC GGCCGGCCGC CCCAGCTCCC 
CATGCCACCA CAAAGGCCCT TTTAAGGGCC ACCACCGCCC TCA7GGAAAG AGCTGAGCCG 
CTTCAGACTG CGGGGCAAGC GGGCCGCGGC TCCCTTCCCC TCCCCTCCCC TCGCCCGCCT 
TCGCCGCCCG GCCTCGAGTC CCCGCCCGCC CCCGCTCCCG TCCCGCACCG CCTGCCGCGT 
CGGCCTCGGG CCTGCCCTGT CCGCCGTCCG CCCTCCGGTA GGG7TCGGGC CTTCCGGATG 
CGGCTTGGGC GCTCTTCGGG GACCTCCGTG GCGCGGAAGA CCCGAGCCTG CCG3GGGGAG 
GCCGGCGGCG CCGCACCTGC CCGCCTCGGC GTTCGTGACT CAGCCGCCCC ATCCCGAGTC 900 

GCTAAGGGGC TGCGGGGAGG CCGCAGCACC TTCTGGAAGA CTTGGCCTTC CGCTCTGACG 960 

CAGGGCCGAG GTGGGCAGTC CAGGCCGAGA GCCGGCGGCC CTGAAGGTGA GTGAGGCCCT 1020 

CGGCAGCTGC AGCCGGGGTG TCTGGTACCC CCCCGGCGTG GTGCTTAGCC CAGGACTTTC 1080 

AGACGGCCGC TGGCCGGGAG GCTTTGGTGG GAGAGACGCG ATCGCCGATT 7CGGTCTGGC 1140 

GCCCCTTCTG cg qcCGGGAC CCAGGCCTTT CACATCAGCT CTCCCTCCAT CTTCATTCAT 1200 

AGGTCTGCGC TGGGGCCGGG ACGAAGCACT TGGTAACAGG CACATCTTCC TCCCGAGTGA 1260 

CTGCCTCCTA GGAGGACATT TAGGGGAGGG CAGAGGCCTG CAGTTTGGCT TCACGGCTGG 1320 

CTATGTGGAC AGCAAGAGTC GTTTTGCGGA ACGCGACTGG CAGCCAGGCC TGTCGGGCCC 1380 

CCGACGCCGC CCCATTTCCC TTCCAGCAAA CTCAACTCGG CAATCCAAGC ACCTAGATAC 1440 

CAGCACAAGT CGGTTAATCC CTGTCTGGAC TGAGCCTCCG TTGGCT7CTG AACTGGAATT 1500 

CTGCAGCTAA CCCTTCCACG ACTAGAACCT TAGGCATTGG GGAGTTTTAG ATC-GACTAAT 1560 
TTTATTAAAG GATTGTTTTT TTTTT 

Seq ID NO: 379 Protein sequence 
Protein Accession #: NP_002096 

1 11 21 31 41 51 

I I I I I I 

MSGRGKTGGK ARAKAKSRSS RAGLQFPVGR VHRLLRKGHY AERVGAGAPV YLAAVLEYLT 60 

AEILELAGNA ARDNKKTRII PRHLQLAIRN DEELNKLLGG VTIAQGGVLP NIQAVLLPKK 120 
TSATVGPKAP SGGKKATQAS QEY 

Seq ID NO: 380 DNA sequence 
Nucleic Acid Accession #: AL136942 
Coding sequence: 184-864 

1 11 21 31 41 51 

I I I I I I 

ACGCGTCCGG CAGAAGCTCG GAGCTCTCGG GGTATCGAGG AGGCAGGCCC GCGGGCGCAC 60 

GGGCGAGCGG GCCGGGAGCC GGAGCGGCGG AGGAGCCGGC AGCAGCGGCG CGGCGGGCTC 120 

CAGGCGAGGC GGTCGACGCT CCTGAAAACT TGCGCGCGCG CTCGCGCCAC TGCGCCCGGA 180 

GCGATGAAGA TGGTCGCGCC CTGGACGCGG TTCTACTCCA ACAGCTGCTG CTTGTGCTGC 240 

CATGTCCGCA CCGGCACCAT CCTGCTCGGC GTCTGGTATC TGATCATCAA TGCTGTGGTA 300 

CTGTTGATTT TATTGAGTGC CCTGGCTGAT CCGGATCAGT ATAACTTTTC AAGTTCTGAA 360 

CTGGGAGGTG ACTTTGAGTT CATGGATGAT GCCAACATGT GCATTGCCAT TGCGATTTCT 420 

CTTCTCATGA TCCTGATATG TGCTATGGCT ACTTACGGAG CGTACAAGCA ACGCGCAGCC 480 

TGGATCATCC CATTCTTCTG TTACCAGATC TTTGACTTTG CCCTGAACAT GTTGGTTGCA 540 

'TTATCC AAACTCCATT CAGGAATACA TACGGCAACT GCCTCCTAAT 600 

\ GAGATGATGT CATGTCAGTG AATCCTACCT GTTTGGTCCT TATTATTCTT 660 

_ A GCATTATCTT GACTTTTAAG GGTTACTTGA TTAGCTGTGT TTGGAACTGC 720 

TACCGATACA TCAATGGTAG GAACTCCTCT GATGTCCTGG TTTATGTTAC CAGCAATGAC 780 

ACTACGGTGC TGCTACCCCC GTATGATGAT GCCACTGTGA ATGGTGCTGC CAAGGAGCCA 840 

CCGCCACCTT ACGTGTCTGC CTAAGCCTTC AAGTGGGCGG AGCTGAGGGC AGCAGCTTGA 900 

CTTTGCAGAC ATCTGAGCAA TAGTTCTGTT ATTTCACTTT TGCCA7GAGC CTC7CTGAGC 960 

TTGTTTGTTG CTGAAATG CT ACTTTTTAAA ATTTAGATGT TAGATTGAAA ACTGTAGTTT 1020 

TCAACATATG CTTTGCTAGA ACACTGTGAT AGATTAACTG TAGAATTCTT CCTGTACGAT 1080 

TGGGGATATA ACGGGCTTCA CTAACCTTCC CTAGGCATTG AAACT7CCCC CAAATCTGAT 1140 

GGACCTAGAA GTCTGCTTTT GTACCTGCTG GGCCCCAAAG TTGGGCATTT TTCTCTCTGT 1200 

TCCCTCTCTT TTGAAAATGT AAAATAAAAC CAAAAATAGA CAACTTTTTC TTCAGCCATT 1260 

CCAGCATAGA GAACAAAACC TTATGGAAAC AGGAATGTCA ATTGTG1 ' 
ATTAGGTAAA TAGAAGTCCT TATGTATGTG TTACAAGAAT TTCCCCCACA A 
TGACTGAAGT TCAATGACAG TTTGTGTTTG GTGGTAAAGG ATTTTCI 

TAAGACCATT AGAAAGCACC AGGCCGTGGG AGCAGTGACC ATCTACTGAC TGTTCTTGTG 1500 

GATCTTGTGT CCAGGGACAT GGGGTGACAT GCCTCGTATG TGTTAGAGGG TGGAATGGAT 1560 

GTGTTTGGCG CTGCATGGGA TCTGGTGCCC CTCTTCTCCT GGATTCACAT CCCCACCCAG 1620 

GGCCCGCTTT TACTAAGTGT TCTGCCCTAG ATTGGTTCAA GGAGGTCATC CAACTGACTT 1680 

TATCAAGTGG AATTGGGATA TATTTGATAT ACTTCTGCCT AACAACATGG AAAAGGGTTT 1740 

TCTTTTCCCT GCAAGCTACA TCCTACTGCT TTGAACTTCC AAGTATGTCT AGTCACCTTT 1800 

TAAAATGTAA ACATTTTCAG AAAAATGAGG ATTGCCTTCC TTGTATGCGC TTTTTACCTT 1860 

GACTACCTGA ATTGCAAGGG ATTTTTATAT ATTCATATGT TACAAAGTCA GCAACTCTCC 1920 

TGTTGGTTCA TTATTGAATG TGCTGTAAAT TAAGTCGTTT GCAATTAAAA CAAGGTTTGC 1980 
CCACATCCAA AAAAAAAAAA AAAAA 

Protein Accession #: CAB66876 

1 11 21 31 41 51 

MKMVAPWTRF YSNSCCLCCH VRTGTILLGV WYLIINAWL LILLSALADP DQYNFSSSEL 60 

GGDFEFMDDA NMCIAIAISL LMILICAMAT YGAYKQRAAW IIPFFCYQIF DFALNMLVAI 120 

TVLIYPNSIQ EYIRQLPPNF PYRDDVHSVN PTCLVLIILL FISI ILTFKG YLISCVWHCY 180 
D VLVYVTSNDT TVLLPPYDDA TVNGAAKEPP PPYVSA 



Coding sequence: 92-1774 

| I' I' I' I' I' 

CAGATGCCAG AAGAACACTG TTGCTCTTGG TGGACGGGCC CAGAGGAATT CAGAGTTAAA 

CCTTGAGTGC CTGCGTCCGT GAGAATTCAG CAT3CAA" T IAT1 T j 

TCIGCTCCTG GCTGCAAGAT TGCCACTTGA TGCCGCCAAA CGATTTCATG ATGTGC7GGG 

CAATGAAAGA CCTTCTGCTT ACATGAGGGA GCACAATCAA TTAAATGGCT GGTCITCTGA 

TGAAAATGAC TGGAATGAAA AACTCTACCC AGTGTGGAAG CGGGGAGACA TGAGGTGGAA 

AAACTCCTGG AAGGGAGGCC GTGTGCAGGC GGTCCTGACC AGTGACTCAC CAGCCCTCGT 
C AACTGGACAG 
AAGCCATCAT AACGTCTTCC 
ATGGAATTTC ATCTACGTCT 
TTCAGTGAGA GTTTCTGTGA 
GACTGTCTAC AGAAGACATG 
CGTGGTAACA GATCAGATTC 



TAGCCACTTC CTCAATTATT 
CCTGTTTGTT TCCACCAATC 
CCTTAACCTC ACTGTGAAAG 
CAGACCTTCA AAACCCACCC 
T GATGAAAACT 



AGATGCCAAA 
AGAAGAACTG CAGAAATGAG GCTGGTTTAT 
CATGGTCAGA GGACAGTGAC GGGGAAAATG 
CTGATGGGAA ACCTTTTCCT CACCACCCCG 
TCCACACACT TGGTCAGTAT TTCCAGAAAT 
ACACAGCCAA TGTGACACTT GGGCCTCAAC 
GACGGGCATA TGTTCCCATC GCACAAGTGA 
CAGAAGAACG 

CTACAAGTGG AGCTTCGGGG 
ATACTGTGAA TCACACGTAT GTGCTCAATG 



AGGAAGATGC 
CTGCTGATCC 
GCACCGGCCA 
GATGGAGAAG 
TGGC-ACGATG 
AAGATGTGTA 



TTCATGATCC 900 



GCCAGATTAA C 



GCCATGGCCT 
GGAGGTCTGT 
CCCTGTGGAT 
GACGTACTGT 
GATTTCTGTT 
CTCCGTTGGC 
CAAGGAATAC 
TGTCTTTCTC 
ACTCAAAAAC 
TTTTCAGTGC 
TTATTGTTAA 
TTTAGAGATG 
AAAGCAACTT 






GTGAACCTCA 
CCTGACAGAG 
TGCTTGGCCA 
AACCCAATAG 
AACCGTGCAA 
CAAGAATTTA 
CATTGATGTG 
ATAGATATTG 



CTGACCCCAC 
TGTGTCTGCT 
CCCTGGGGGA 
ACCCAGCCTC 
TATTTGTCAC 
AAAATAGTCC 
AAGCCGTGTT 
AAGGAGTTTC 
AGATGTGCTG 



CTGCGAGATC 
GACTGTGAGA 
TGACACAAGC 
GCCTTTAAGG 
TGTGATCTCC 
TGGGAATGTG 



CCACCATCAC 
ACAGACGTCC TGATGCCGGT 
TGCCAAGGGA GCATTCCCAC 
ACCCAGAACA CAGTCTGCAG 
CGAACCTTCA ATGGGTCTGG 



AGCCTAACCC 
ATGCATAAAG 
TTCAATACAC 



TGACAACCTA 
GGACATTTAG 
ATTTCCAAAT 
AGATGAGGTC 
ACTAAAACCA 
AAGTGTGGGA 

Seq ID NO: 



AGCAAGGCTT 
GATAGAAACA 
AGGTTAACTG 
CCAATGTAGT 
ACTCATGAAC 
GCTAGACTCA 
CTTTGCTTGG 
TTAGTGCTTT 
TTTTGTATAG 



TTATACTGCA 
CTTTTCATTA 
CTGTGTCCCG 



TTAAATTTCG 
GAGTGGCTAT 
AAGTTGAATT 
GGCAGCTTCA 



CCAGTTTCTA 
TCCTGATGGA 
GAAAAAATAC 
CTGAGTGAAG 
TTATATACCA 
TCGCTGCACA 
CATGGCAACT 
GTTAGACATG 
AAAAAAAAA 

i sequence 



TTTTATAGGT 
GCCATGTTGT 
TTCACTTATA 
GAGAAGCTAC 
TCAGCTTTCC 
CCAAGCTAAC 
GCCCAAGCCT 



AGGATCCGCT 
TGAAGCTCAC 
TTCCTAAAGA 
TAAATGTCAT 
GAAACTGATA 
AAGTCTTAGG 
TATTGATTAG 
ATGTAACTGT 
TGAA7CCCAC 



AGAGTAAGGA 
CGGGATACTT 
AGATCATGTT 
ACAATAACAG 
TACTCTCATA 
GAATGATATT CATATATTCA TTTATTCCAT 
GGCATGATGC TGAGTGACAC TCTTGTGTAT 
TATTTGAAAT CATATATTAA GACTTTCCAA 
GGATTTCACC TCTG7TTGTA 
CTGAAAAATA 



ACATTCTTTT T 



2520 
2580 
2640 



VWKRGDMRWK N 

GQYFQKLGRC 
TMFQKNDRHS 
HTYVLNCTFS 
RYGHFQATIT 
CEITQNTVCS 
PLRMANSALI 
FPGNQEKDPL 



SDETFLKDLP 
LNLTVKAAAP 
IVEGILEVNI 
PVDVDEMCLL 
SVGCLAIFVT 
LKNQEFKGV S 



AAKRFHDVLG 
VLTSDSPAI.V 
DSDGENGTGQ 
VTLGPOLMEV 
IMFDVLIHDP 
GPCPPPPPPP 
IQMTDVLMPV 
TVRRTFNGSG 
VISLLVYKKH 



NERPSAYMRE H 



D ENDWNSKLY? 



SHHNVFPDGK 
TVYRRHGRAY 
SHFLNYSTIN 
RPSKPTPSLG 
PWPESSLIDF 
TYCVNLTLGD 



WTDQIFVFV 
LFVSTNHTVN 
RIPDENCQIN 
EVCTIISDPT 
I3VPDRDPAS 
VFLNRAKAVF 



Seq ID NO: 384 DNA sequence 
Nucleic Acid Accession #: NM_001134 
Coding sequence: 48-1877 



TCCATATTGT 
AATCAATTTT 
AATATGGAAT 
ACCTGGCTAC 
AAATGGTGAA 
GGTGTTTAGA 
TGGAGAAGTA 
TTCTTGCACA 
TCACAAGCTG 



GCTTCCACCA 
TTTAATTTTC 
AGCTTCCATA 



CAAARAGCCC 
TGAAGCAT A T 
AAGGCATCCC 
AATTCCATCT 
AGTTACAAAA 
AAATTTTGGG 



I 

CTGCCAATAA 
CTACTAAATT 
TTGGATTCTT 
GCCCAGTTTG 
ACTGCAATTG 
CCTGCCTTTC 
GACTGCTGCA 
ACTCCAGCAT 
GAAGAAGACA 
TTCCTGTATG 
TGCTGCAAAG 



TTACTGAATC 
ACCAATGTAC 
TTCAAGAAGC 
AGAAACCCAC 
TGGAAGAACT 



\ CTGTTGCAGA 



TGACCACGCT 
GTCTATCTCC 
GGGAAAAAAA 
TTGCTGTCTC 
TCCAGACTGA 



AAATCTAAAC 
TATCTTCTTG 
AGTAATTCTA 
AAACCCTCTT 
CCAAGCATTG 



TTTACTGAAA 
GGAGATGTGC 
CAAGACACTC 
CAATGTATAA 
AGGTTTTTAG 
GCAAGTTTTG 
AGAGTTGCTA 
GAATGCCAAG 
GCAAAGCGAA 



CGATCCCACT 
GGGAGACATT 
CACCTACAAT 
CTGAAAATGC 
AAAGCAGCTT 
TCCAAGCCAT 



I 

" AGCAACCATG 
CAGAACACTG 
TGCAGAGATA 
CACITACAAG 
TGGAGATGAA 
TTGCCATGAG 
AGAGGGAAGA 
TTTCCAAGTT 



51 



TCTTCTTTGG 
AGTTGAATGC 
GTTAAATCAA 
AACTG7TACT 



I 

AAGTGGGTGG 
CATAGAAATG 
AGT7TAGCTG 
GAAGTAAGCA 
CAGTCTTCAG 
AAAGAAATTT 
CATAACTGTT 
CCAGAACCTG 
TTCATTTATG 
GCTGCTCGCT 
TTCCAAACAA 



CATGCATGTG 720 



TGTCAAACAA A 



h TGCTGCAAAG 960 



TTTTAACCAA T 
TTCAAGAAGA CATCCTCAGC 
GGAGTTATTG GAGAAGTGTT 
ATAAAGGAGA AGAAGAATTA CAGAAATACA 
GCTGCGGCCT CTTCCAGAAA CTAGGAGAAT 
ACACAAAGAA AGCCCCCCAG CTGACCTCGT 
CAGCCACAGC AGCCACTTGT TGCCftACTCA 
3 GAGCGGCTGA CATTATTATC GGACACTTAT 




GTATCAGACA TGAAATGACT CCAGTAAACC CTGGTGTTGG CCAGTGCTGC ACTTCTTCAT 
ATGCCAACAG GAGGCCATGC TTCAGCAGCT TGGTGGTGGA TGAAACATAT GTCCCTCCTG 
CATTCTCTGA TGACAAGTTC ATTTTCCATA AGGATCTGTG CCAAGCTCAG C-GTGTAGCC-C 
TGCAAACGAT GAAGCAAGAG TTTCTCATTA ACCTTGTGAA GCAAAAC-CCA CAAATAACAG 
AGGAACAACT TGAGGCTGTC ATTGCAGATT TCTCAGGCCT GTTGGAGAAA TGCTGCCAAG 
GCCAGGAACA GGAAGTCTGC TTTGCTGAAG AGGGACAAAA ACTGATTTCA AAAACTCGTG 
CTGCTTTGGG AGTTTAAATT ACTTCAGGGG AAGAGAAGAC AAAACGAGTC TTTCATTCGG 
TGTGAACTTT TCTCTTTAAT TTTAACTGAT TTAACACTTT TPGTGAATTA ATGAAATGAT 
A TATCTCCAAA TG 
RHNCFLAHKK F 



TKLSQKPTKV 
ECCKLTTLER 
RHPQLAVSVI 
KLGEYYLQNA 
IGHLCIRHEM 
QGVALQTMKQ 
SKTRAALGV 



Seq ID NO: 386 DNA sequence 
Nucleic Acid Accession #: NM 
. .3149 



NFTEIQKLVL 
GQCI IHAEND 
LRVAKGYQEL 
FLVAYTKKAP 
TPVNPGVGQC 
EFLINLVKQK 



I I 
LHRNEYGIAS ILDSYQCTAE 
EQSSGCLENQ LPAFLEELCH 
VPEPVTSCEA YEEDRETFMN 
CFQTKAATVT KELRESSLLN QHACAVMKNF 
DVAHVHEHCC RGDVLDCLQD GEKIM3YICS 
EKPEGLSPNL NRFLGDRDFN QFSSGEKNIF 
LEKCFQTENP LECQDKGEEE LQKYIQESQA 
QLTSSELMAI TRKMAATAAT CCQLSEDKLL 
CTSSYANRRP CFSSLWDET YVPPAFSDDK 
PQITEEQLEA VIADFSGLLE KCCQGQEQEV 



GTRTFQAITV 24 0 

QQDTLSNKIT 300 

LASFVHEYSR 360 

LAKRSCGLFQ 42 0 

ACGEGAADI I 48 0 






CGCCGACCCC 
GGCTTCAACT 
GGATTCTCAG 
CCCAAGGCTA 
TGGGGTGCCA 



GGACGCCAGA 
CGCTSSTGCC 
TAGACGCGGA 
TGGAGTTTTA 
ATACCAGCCA 
GCCCCACACA 
CACTGTCCAG 
CAACAGTTCG 



CCGGCCGGGA 



GTGCACCCCC 



I I I 

C CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 
G CTGCTSSTGC 
GTACTCTCGG 
ACAGACGGGG 
CTGCAGGGTG 
ATTGAATTTG 
GAGGAGCCTG 



GGACAGGGTT 
TTAGGTGGAC 
ATTGCAGAAT 



TTCAGTGGTG 
GGCTATGTCA 

CAGATGGCCT 



CCCGAATTCT 
ACTGCCAAGG 
CAGGAAGCTA 
CTTATTACCC 
GTTCCATCTA 
ATGACACAGA 
CCATCCTTAA 
CCTACTTTGG 
TGGTGGGGGC 
GGGTCTACGT 



GCCACTCACC 



AGGCTTCAGT 
TTTCTGGCAA 
CGAGTACCTG 



CCCTGCCGC7 
GGCCAGATCC 
TACCTAGGAT 



TCAGTGTGCT 
GTGCTGTCTA 
ACAGCAAAGG 
TGGAGTACAA 
TGGCATGCGC 
GCACCTGCTA 
CAGATTTCAG 
CCAAGACTGG 
TGTCTGCCAC 
TTCAGGGGCA 



CTCCTTCTTC 18 0 

GGTGGGAGCA 240 

CCTCTGTCCT 300 

CTCTCGGCTC 3 60 

GTCCTTGCAG 420 

TCCACTGTAC 480 

CCTCTCCACA 54 0 

CTGGGCAGCA 60 0 

CCGTGTGGTT 66 0 



TGGCTCAGAC 
ACCCCTGCTC 
TGAGTTTGGC 
ATTTCCTGGG 



ATTCGATCCC 



ACGTCAATGG 



CTCAGGGGAA 96 0 



GTGGACAAGG 
ATCTTCCCCG 
GCCTGCATCA 
GGTTTCACAG 
CTGTTCCTGG 
CGAGAGGATT 
CTCTCGCCGA 



GAGACCTGGA 
CTGTGGTATA 
CCATGTTCAA 
ACCTTAGCTT 
TGGAACTTCA 
CCTCCAGGCA 
GCAGAGAGAT 
TTCACATCGC 
GGCCAGCCCT 
ACTGTGGAGA 



CACCCAGCCG 
CGATTTGGCA GCTCCTTGAC 
GCCATCGGGG CTCCCTTTGG 
GGCCCAGGAG GGCTGGGCTC 
CACACCCCAG AC' 



CAGGGGCCGC 



CAGAATGTGG 
GCTGAGTACT 
TTTGCCGTGA 
GCCAGTCTGT 



GTGAGGGTGG 
CAGGACTCGT 
ACCAGAGCCG 
GGGGTGGCCT 



CTGCCTCAAT 
GCTGGACTGG 
GGCAACCCTG 
GAAGATCTAC 
TCTCAACTTC 
ACATTATCAG 
AGACAACATC 
TGGGTGAC 



TGA7TGTGGG G 
CCGCTAGTGC C 
GCTTAGAGGG G 



AGCCAGGGTG 
GTGACCAGAG 
GAGTTGGATC 



TGCTGGAACT 
TTACGGGACT 
CCGAGGGTTC 
CGGGACCTCA 
GGCCCCTGCA 
CTTTCTTGCA 



CGCCTATGAG 
CAGACACCCA 
CCTGCTGGTG 
TCGGTTTACA 
CCTCAGCAAG 
GGCTCAGGCC 
AAGCGACTGG 
CCATGTCTAT 
CAGCTGTCCC 
CAACTGCACC 
CCTGCACCAC 



CAGAAGCAGA 
ACCCAGACCC 
CTCAGGAACG 
TCCTTGGACC 
AGCAAGAGCC 
TGTGTGCCTG 
AAGAATGCCC 
GCTGAGCTTC 
GGGAACTTCT 



TGCTCATCCA 
AGTCAGAA7T 
CCCAAGCCCC 



ACGGCGGGCA 
GAATGGGGCT 
TCGAGACAAA 
AGTGGACAGC 
CAAGGCTCAG 
GGAAGTGTTT 
TTTCCATGCC 



AATCTCAACA 
CAGGTCACCC 
CATCCCCGAG 
GAGCTCATCA 
CAGGCTCTGG 
ACCAATCACC 
CAGCAAAAAC 



GCAACCCCAT 
TCCGGGACAC 
ACTCGCAAAG 
TGAACGGTGT 
ACCAGCCTCA 
ACCAAGGCCC 
AAGGTCAGCA 
CCATTAACCC 
GGGAAGCTCC 



CTGTGACTAC 
GAAGGCAGGA 
TAAGAAAACC 
CGACGTGGTT 
CTCCAAGCCT 
GAAGGAGGAG 
CAGCTCCATT 
GCTCC7ATAT 
AAAGGGCCTG 
AAGCCGCAGC 
CAGGCTGCGC 
TTTCCGAGTC 
TGAC-GCTGTG 



CAGGTGGCCA CAGC1GTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGIGG 
ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACA7CCTC 
TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 
CTCAAGCCTC CAGCCACCTC TGATGCCTGA 



EVGRVYVYLQ 
QQGWFVFPG 
VDKAWYRGR 
GFTVELQLDW 



HAVQLRWGPR 
TDGVSVLVGA PKANTSQPGV 
EEPVEYKSLQ WFGATVRAHG 
PCRSDFSWAA GQGYCQGGFS 
'INLVQGQLQT RQASSIYDDS 
QMASYFGYAV 



LQGGAVYLCP 
AEFTKTGRW 



IEFDSKGSRL 
DPVGICYL3T 

GQILSATQEQ 



GEQNHVYLGD 
FAVNQSRLLV 
SFRLSVEAQA 
SQGVLELSCP 
SASSGPQILK 
YKALKMPYRI 
YKLGFFKRSL 



QKQKGGVRRA 
SLDPQAPVDS 
KNALNLTFHA 



LFLASRQATL 
HGLRPALHYQ 
QNVGEGGAYE 



QVTLNGVSKP 
QALEGQQLLY 
CPEAECFRLR 
LPRQLPQKER 
PYGTAMEKAQ 



VTRVTGLNCT 
CELGPLHQQE 
QVATAVQWTK 
LKPPATSDA 



HPRDQPQKEE 
TNHPINPKGL 
SQSLQLHFRV 
AEGSYGVPLW 



MDRTPDGRPQ 
DLDQDGYNDV A 
RGGRDLDGNG Y 
ACINLSFCLN ASGKHVADSI 
REDCREMKIY LRNESEFRDK 
CVPDLQLEVF 
GNFSSLSCDY 
NL1INSQSDW 
ELINQGPSSI 
QQKREAPSRS 
QPFSLQCEAV 
LLLGLijIYIL 



Seq ID NO: 388 DNA sequence 
Nucleic Acid Accession #: NM 
Coding sequence: 26..1453 



AAAGAAGGTA 
AGTCTGCTCT 
TGCCCAGCAA 



AGGGCAGTGA 
GCCTATCCTC 
TACCTAGAAA 
AATCTCATTG 



TGAGAAAGCT 



GGTCACTTCA 
ATTGTGAATT 
CTGAAAGTCT 



GAATGATGCA 
TGAGTGGGGC 
AGTACTACAA 
TTAAAAAAAT 
CTGACACTCT 
GCTCCTTTCC 
ATACACCAGA 



I 

TCTTGCATTC 
AGCAAAAGAG 
CCTCGAAAAG 
CCAAGGAATG 
GGAGGTGATG 
TGGCATGCCG 



I 



I 



TGATGGCCCA 
TATTCACTTT 
CGTTGCTGCT 



TCTCTTTCGC 



GACTCCACTC 



TGATGTGAAT 
GGTGCCCACA 
GTCCTTCGAT 
TTGGCGAAGA 



GATGATGATG 
CCACTCTACA 



GAGGACICCA ACAAGGATCT 
GATGTGAAAC AGTTTAGAAG 
CAGAAGTTCC TTGGGTTGGA 
CGCAAGCCCA GGTGTGGAGT 
AAGTGGAGGA AAACCCACCT 
GATGCTGTTG ATTCTGCCAT 
ACATTCTCCA GGCTGTATGA 
CATGGAGACT TTTACTCTTT 
!AI < T Hi' TTTATGGAGA 
TCAGGCACCA ATTTAT7CCT 
ACACTGAAGC 



AAATCTGTTC 
TCCCACTGGA 



AGGCATCCAT A 



GCCTAAGGTT 
ACAGTTTGAG 
GTTACATTGC 
ATTATTCATC 
GAAGAAGATG 



CTTCGGGATC TGAGATGCCA GCCAAGTGTG 

ACCCTGAACC TGAAT7TCAT TTGATTTCTG 
ATGCTGCATA TGAAG7TAAC AGCAGGGACA 
GGGCCATCAG AGGAAATGAG GTACAAGCAG 
TTCCTCCAAC CATAAGGAAA 
ACTTCTTTGC AGCGGACAAA 
ACTAATAGCT 



TTTGACCCCA 
TAGGCGAGAT 
TAATGTATTA 
AGCCTTGCAG 
GAATTGCACT 
ATGTATTTTC ATAGATGTGT 



ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 
AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 
TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 
ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 
GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA 
TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 



Protein Accession #: NP_002416 



I 



I 



I 



I 



I 



I 



MHLAFLVLLC LPVCSAYPLS GAAKEEDSNK DLAQQYLEKY YNLEKDVKQF RRKDSNLIVK 

KIQGMQKFLG LEVTGKLDTD TLEVMRKPRC GVPDVGHFSS FPGMPKWRKT HLTYRIVNYT 

PDLPRDAVDS AIEKALKVWE EVTPLTFSRL YEGEADIMIS FAVKEHGDFY SFDGPGHSLA 

HAYPPGPGLY GDIHFDDDEK WTEDASGTNL FLVAAHELGH SLGLFHSANT EALMYPLYNS 

FTELAQFRLS QDDVNGIQSL YGPPPASTEE PLVPTKSVPS GSEMPAKCDP ALSFDAISTL 

RGEYLFFKDR YFWRRSHWNP EPEFHLISAF WPSLPSYLDA AYEVNSRDTV FIFKGNEFWA 

I RGNEVQAGY PRGIHTLGFP PTIRKIDAAV SDKEKKKTYF FAADKYWRFD ENSQSMEQC-F 

PRLIADDFPG VEPKVDAVLQ AFGFFYFFSG SSQFEFDPNA RMVTHILKSN SWLHC 

Seq ID NO: 390 DNA sequence 

Nucleic Acid Accession #: NM_002421.2 

Coding sequence: 1..1409 



ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTGGTGTC ACACAGCITC 
CCAGCGACTC TAG AAA C ACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 
TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC « 
GTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAGAT 
GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGA7GT GGCTCAGTTT 



TGGCCCAGTG 180 
GTCCTCACTG AGGGGAACCC T 
3 ATTTGCCAAG A 






3 CAAACACATC 



AGGCCCAGGT 
CCATTCTACT 



AAGGTCTCTG 
AACTCTCCTT 
ATTGGAGGGG 
AACTTACATC 
GATATCGGGG 
GATGACATTG 



I AACTACGATT 



TTGAGAAAGC 
AGGGTCAAGC 
TTGATGGACC 
ATGCTCATT? 
GTGTTGCGGC 
CTTTGATC-TA 
ATGGCATCCA 
CCCCAAAAGC 



CGGTTTTTCA 



TCTACAGCTC 
AACGATCTAT 



CTTTGGCTTC 



GAAGTTGAGC 
GTTCAGGGAC 



7TGCCGACAG 
TGAAGCATA? 



GATTGAAAAT 
CTTCCAACTC 
AGACATCATG 
TGGAGGAAAT 
TGATGAAGAT 
TCATGAACTC 
CCCTAGCTAC 
AGCCATATAT 
ATGTGACAGT 
TAAAGACAGA 
TTCTGTTTTC 
AGATGAAGTC 
ACACGGATAC 



TATCCCAAAA 
ACGAAGAGAA 



CTGGAGGTAT 



Seq ID NO: 392 DNA sequence 

Nucleic Acid Accession #: NM_002421.2 

Coding sequence: 1..1409 



Protein Accessic 



MHSFPPLLLL LPWGWSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ V 
VEKLKQMQEF PGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 
LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPSY 
TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 
FYMRTNPFYP EVELNF I S VF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 
PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP 
GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN 






AGAGCAAGAT GTGGACTTAG 



3 GTGTGGTGTC ACACAGCTTC 
TCCAGAAATA CCTGGAAAAA 
GGAGAAATAG TGGCCCAGTG 



GTCCTCACTG AGGGGAACCC 



CAAACACATC 



TGGAGTAATG 

CTTGCTCATG 
CAAAGGTGGA 



ACCTTCAGTG 
GGACGTTCCC 
AAGCTAACCT 
TTCTACATGC 
TGGCCACAAC 
CGGTTTTTCA 
CCCAAGGACA 



TCACACCTCT GACATTCACC AAGGTCTC 
TCAGGGGAGA TCATCGO 
CTTTTCAACC AGGCCCAGGT A 

CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTG 
TTGGACTCTC CCATTCTACT GATATCGGGG CTTTGATGTA CCCTAGCTAC 
GATGACATTG A 



GATGAATATA 
GGAATTGGCC 
GGAACAAGAC 
AATAGCTGGT 



TTGATGCTAT AACTACGATT CGGGGAGAAG TGATGTTCTT TAAAGACAGA 
GCACAAATCC CTTCTACCCG GAAGTTGAGC TCAATTTCAT TTCTGTTTTC 
TGCCAAATGG GCTTGAAGCT GCTTACGAAT TTGCCGACAG AGATGAAGTC 
AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 
TCTACAGCTC CTTTGGCTTC CCTAGAACTG TGAAGCATAT CGATGCTGCT 
AAAACACTGG AAAAACCTAC TTCTTTGTTG CTAACAAATA CTGGAGGTAT 
AACGATCTAT GGATCCAGGT TATCCCAAAA TGATAGCACA TGACTTTCCT 
ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 
AATACAAATT TGATCCTAAA ACGAAGAGAA TTTTGACTCT CCAGAAAGCT 
TCAACTGCAG GAAAAATTAG 



Seq ID NO: 393 P 



I 



I 



I 



L LFWGWSHSF PATLETQEQD VDLVQKYLEK YYKLKNDGRQ VEKRRNSGPV 

VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 

YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 

LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHAL GHSLGLSHST DIGALMYPSY 

TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 

FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHC-Y 



Seq ID NO: 394 DNA sequence 
Nucleic Acid Accession #: 
Coding sequence: 1..1506 
TCTCTCCTAA 



ACCATCTGGA 
TTACCAGCTT 



GGGCAACAAG 
GAGGGGAGTC 
GGGCGTGCTC 



GAGCCACCTG 



CCTGAACTTG 
AGCATGAGTG 
GCAATTCTGA 
TTTAAAGACG 
TATGGAATGT 
AACCCTGAAA 
TATGTGCTGA 
AATGCAGTGG 
ATCTTTGTTG 
TTATTCTATG 
CGCAAGCACA 
TTCTCTGGAG 
GGGCTGGCAG 
TTCAAGGTGC 
CTTTCCCTCT 



TAAAGAAATC 
TTGTACGAGT 
TGGCATTTGG 



CTATTTGGAG 



ACGCTACATT 
CATTACAGCT 
CGCCCGGATC 
CCCTGGAGTT 



ATGCATATGC TGGCTGGTTT T 



CAAATGTGGC 
CAGTGACCTT 
CCCTCTCCTG 
TTGCGTCTCG 
CTCCTCTACC 



CTCCTCATAA 
CTGGAACCAT 
GTGGGCATAA 
CAGATTTTCT 
ATGCAGCTAA 
AGTATTACGC 

rGTATATCCA 



CTTACCTCCA 
GGCAGGAGAA 
TTGGCACCAT 
GCAGCGTGGG 
CTTTGTCTTA 
TTTTGGAAGT 
TACGCCCTGC 
TTTTTATTCA 
TTCTGAGCGG 
CTTTGGCTCC 
AGAGGGTCAC 



TCAGAGAAAA 
TTATGAACTA 
TTTTTACTTC 
CAGTTATTTT 
AACTCTATGT 
GTCTCTGATA 
TCTCTACAAC 
TATATATGGG 
TTTTCAATTC 
ATTTTACATT 
GCTTTAATGG 
TTAAAGAAGA 
AAAAATCCTT 
TTATCTGTCA 
AGCAAGAGTT 
TACCCCTGAT 
TGAGAGAAAT 
CTACATGCAA 
TGAATTTTGA 
CTTCAGATGA 
AAGAAATGTC 
ATCTAGGCTT 
CTGATAAGAA 
GTTTTGCCAG 
GCACTTTGGG 
CAACATGGAG 
GCTGGTAATC 
GAGGTTGCAG 
CCATCTCCAA 



TTGCTGGGCT 
CACTGTTCAT 
ATTCGGACCC 
ATTATCTCTT 
TAACCAGAAC 
ATGGACTTGA 
ATTTTCTGAA 
TATTCATATA 
AGTTATAGAA 
CCTACCTATT 
ATATGTTAGC 
TTTTGTAAAG 
TGAAAAAAAG 
GACATTGCAT 
ATTATACCCA 
GTTTCTAGGG 



TCTTTTGAAT 
GATTTATCTT 
CCCAGCTTTG 
ATTTAGTACA 
TATTATATGG 



AGTTTGGTAT 



AACCAACAAA 
TGTTAGTAAT 
CAGTTTGTGC 
AACTGTCCAG 
GCTGTAAATA 
TGTCAGTAAT 



GGGGTTAGGA 
ACGGCAAAGA 
ATGGTTT1AC 
CATACATCAT 
TGCTTCCCCT 
GAGCACTTTG 
GCTACTGTTT 
TATGTCAGAT 
TCACATCAGT 
TAAATCCTCA 
AAACATATGC 



CTACTGGGAA 
ATGAACGGTG 
CTTCCAGAAA 

TTCCTCAGTT 
CGATACAAAT 
TTTTCCTTCA 
GGGATTGGCT 
GACAAGAAAC 
ATACTGGAAG 
ATCTGCCCAA 
ATTACAACTT 
TTCGAACTAA 
CAGTTATTCT 
GAAAAGACTA 



GA AGAAGTAGAA 
CTGAGGAGCT GCTGCTTTCA 



GCCCAGATAT 
CATGCCTCTT 
TCGTCATCAC 



TTGTACCAGA 
GGGGAGACAC 
TGGTGATAAA 
TTTCTAAGAA 
ATGAGTCGCA 
GACAATTACT 
TGAAGACTGA 
ACACTACAGA TGTCTATACT 
GATTATGGCA AAGAGGAGAG 
TTTAGATAAC 
AGTGGGGATT 
TCCAGGAGTT 
TCATTATCAG 
TTGATCAGGA AAGTGTATAA 
TTAGAACAAC 
ATTTTAAGCC 



GCTTTTTATT 
GCATCGTCCT 
CATGGTTGCC 
TCTGACTGGA 
TAGAATAATG 
AGAAGATAAG 
AAAATAGGGA 
CAAAAGGAGT 



CAATTCTTGA 
ATGTGGTCAT 
GATTTTTCTG 
GTGAAAAGTG 
AAAGAAATTT 
AAACACTCAT 
GTTGAATACA 
ATGTTTAAGT 
GAAGTTTTAG 



TCTGAAGTTT 



ATTAATTAGG 
AGATTTACAA 
TTCCACACCT 
ATGAGAATCT 
TATTAGAAAA TACTGTGAGC 
GG7GGATCAC 



^ TACATTTTAT A 



TGAGCCAAGA T 



CTGATGTT 
TAATTATCAT T 
GTGGATAAGT G 
CGGGCATGGT G 
CTGAGGTCGG G 
ATACAAAATT A 

GGCAGGAGAA TTGCTTGAAC CCGGGAGGCG 
GTACTCCAGC CTGGGTGACA AAGTCAGACT 312 0 



T TCAACTTGCA 
GTTTGTGTTC 
T GGCTTACATC 
3 GAGTTC7AGA 



7ACACGATGA 
TAAAATATCT 
AAAATTGCAA 
CCAC7TCTA7 
AAAGAGACAA 
AGAAGATGTT 
7G7AA7CCCA 
CCAGCCTGAC 
GGTGGCACAT 



2100 
2160 
2220 



2400 

2520 
2SB0 



Seq ID NO: 395 Protein sequence 



I 

MVRKPWSTI 
GIPISPKGVL 
LPAFVRVWVE 
SMSVSWSARI 
YGMYAYAGWF 
NAVAVTFSER 



SKGGYLQGNV 



YLNFVTEEVE 
LLGNFSLAVP 
VLHPLTMIML 
FSFTCLFMVA 
ILEWPEEDK 



I 

NGRLPSLGNK 
TIWTVCGVLS 
VISLAFGRYI 
AILIIIVPGV 
NPEKTIPLAI 
IFVALSCFGS 
FSGDLDSLLN 
LSLYSDPFST 



31 

I 

EPPGQEKVQL 
LFGALSYAEL 
LEPFFIQCEI 
MQLIKGQTQN 
CISMAITIGV 
MNGGVFAVSR 



KRKVTLLRGV 
GTTIKKSGGH 
PELAIKLITA 
FKDAFSGRDS 
YVLTMVAYFT 



51 

I 

■ SIIIGTIIGA 
YTYILEVFGP 
VGITWMVLN 
SITRLPLAFY 
TINAEELLLS 
LPEILSMIHV 



GIGFVITLTG V 



C 7GCACCATGG 



ACGGACCCTG 
GCCAGTTCCT GTACGGGGGC 
GCGACGATGC TTGCTGGAGG 
TGGACGACCA GTGTGAGGGG 
GTGAAAAATT CTTTTCCGGT 
AAGCTACTTG TATGGGCTTC 
AAGATGAGGG ACTGTGCTCT 
CCTGTGATGC 



ACTACTACGA CAGGTACACC- 

ACGCCAACAA ITTCTACACC 

TTCCCAAAGT TTGCCGGCTG 

AGTATTTCTT TAATCTAAGT 

GGAACCGGAT TGAGAACAGG 
ATCATTTTGC 

GAGGGAATGA CAATAACTTT 



CCCC7AGAC7 
CAGAGCTGCC 
7GGGAG3CT7 
CAAG7GAGTG 



TTTCCAGATG 480 
AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC 720 
GCTTTGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 
ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA 840 
GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTT7 TTATTAATAC AAGTCACTTT 900 
TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960 

TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020 

AAATATGACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAGAAGAGG ATCATAACTG 1080 

A AGACAATATA ATCATGTGCT TTTAACATAT TTGAGAATAA AAAGGACTAG 114 0 



MDPARPLGLS I LLLFLTEAA LGDAAQEPTG NNAEICLLPL DYGPCRALLL RYYYDRYTQS 
CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRLQV SVDDQCEG3T EKYFPNLSSM 
TCEKPPSGGC HRNRIENRFP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYFNPRY 
RTCDAFTYTG CGGNDNNFVS RED CKRACAK ALKKKKKMPK LRFASRIRKI RKKQF 



I I I I I I 

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAGTCATGTC 60 

CCCGAGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120 

TTCGTGATGG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAG3T GCTGCAGAAG 180 

AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTGA GTTTC-GCTTG CTCGGACATC 240 

TTGGTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300 

ACGTCCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTCGAG3C CTGCAGCTAC 360 

GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGCGCT ACATCGCCAT CTGTCACCCC 420 

TTCAGGTACA AGGCTGTGTC GGGACCTTGC CAGGTGAAGC TGCTGATTGG CTTCGTCTGG 480 

GTCACCTCCG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA TGGGTACTGA GTACCCCCTG 540 

GTGAACGTGC CCAGCCACCG GGGTCTCACT TGCAACGGCT CCAGCACCCG CCACCACGAG 600 

CAGCCCGAGA CCTCCAATAT GTCCATCTGT ACCAACCTCT CCAGCCGCTG GACCGTGTTC 660 

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTTC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCTCGCT GGCCGGGGGC 780 

ACGCGGCCTC CGCAGCTGAG GAAGTCCGAG AGCGAAGAGA GCAGGACCGC CAGGAGGCAG 840 

ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCCG TATGCTGGAT GCCCAACCAG 900 

ATTCGGAGGA TCATGGCTGC GGCCAAACCC AAGCACGACT GGACGAGGTC CTACTTCCGG 960 

GCGTACATGA TCCTCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020 

CCGCTCCTGT ACACGGTGTC CTCGCAGCAG TTTCGGCGGG TGTTCGTGCA GGTGCTGTGC 1080 

TGCCGCCTGT CGCTGCAGCA CGCCAACCAC GAGAAGCGCC TGCGCGTACA TGCGCACTCC 1140 

ACCACCGACA GCGCCCGCTT TGTGCAGCGC CCGTTGCTCT TCGCGTCCCG GCGCCAGTCC 1200 

TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260 

TCTAAGTCCC AGTCATTGAG TCTCGAGTCA CTAGAGCCCA ACTCAGGOGC GAAACCRGCC 1320 
AATTCTGCTG CAGAGAATGG TTTTCAGGAG CATGAAGTTT GA 

Seq ID NO: 399 Protein sequence 



1 11 21 31 41 51 

I I I I I I 

MASPSLPGSD CSQIIDHSHV PEFEVATWIK ITLILVYLII FVMGLLGNSV TIRVTQVLQK 

KGYLQKEVTD HMVSLACSDI LVFLIGMPME FYSIIWNPLT TSSYTLSCKL HTFLFEACSY 

ATLLHVLTLS FERYIAICHP FRYKAVSGPC QVKLLIGFVW VTSALVALPL LFAMGTEYPL 

VHVPSHRGLT CNRSSTRHHE QPETSNMSIC TNLSSRWTVF QSSIFGAFW YLWLLSVAF 

MCWNMMQVLM KSQKGSLAGG TRPPQLRKSE SEESRTARRQ TIIFLRLIW TLAVCWMPNQ 

IRRIMAAAKP KHDWTRSYFR AYMILLPFSE TFFYLSSVIN PLLYTVSSQQ FRRVFVQVLC 

CRLSLQHANH EKRLRVHAHS TTDSARFVQR PLLFASRRQS SARRTEKIFL STFQSEAEPQ 

SKSQSLSLES LEPNSGAKPA NSAAENGFQE HEV 

Seq ID NO: 400 DNA sequence 



1 11 21 31 41 51 

I I I I I 

AACAGAACTG CAACGGAGAG ACTCAAGATG ATTCCCTTTT TACCCATC-TT TTCTCTACTA 60 

TTGCTGCTTA TTGTTAACCC TATAAACGCC AACAATCATT ATGACAAGAT CTTGGCTCAT 120 

AGTCGTATCA GGGGTCGGGA CCAAGGCCCA AATGTCTGT3 CCCTTCAACA GAT7TTGGGC 180 

ACCAAAAAGA AATACTTCAG CACTTGTAAG AACTGGTATA AAAAGTCCAT CTGTGGACAG 240 

AAAACGACTG TTTTATATGA ATGTTGCCCT GGTTATATGA GAATGGAAGG AATGAAAGGC 300 

TGCCCAGCAG TTTTGCCCAT TGACCATGTT TATGGCACTC TGGGCATCGT 3GGAGCCACC 360 

ACAACGCAGC GCTATTCTGA CGCCTCAAAA CTGAGGGAGG AGATCGAC-GG AAAGGGATCC 42 0 

TTCACTTACT TTGCACCGAG TAATGAGGCT TGGGACAACT TG3ATTCTGA TATCCGTAGA 480 

GGTTTGGAGA GCAACGTGAA TGTTGAATTA CTGAATGCTT TACA7AGTCA CATGATTAAT 540 

AAGAGAATGT TGACCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 600 

TTGGGGCTTT TCATTAACCA TTATCCTAAT GGGGTTGTCA CTC-T7AATTG TGCTCGAATC 66 0 

ATCCATGGGA ACCAGATTGC AACAAATGGT GTTGTCCATG TCATTGACCG TGTGCTTACA 72 0 

CAAATTGGTA CCTCAATTCA AGACTTCATT GAAGCAGAAG ATC-ACCTTTC ATCTTTTAGA 780 

GCAGCTGCCA TCACATCGGA CATATTGGAG GCCCTTGGAA GAC-ACGGTCA CTTCACACTC 84 0 

TTTGCTCCCA CCAATGAGGC TTTTGAGAAA CTTCCACGAG GTGTCCTAGA AAGGTTCATC- 90 0 

GGAGACAAAG TGGCTTCCGA AGCTCTTATG AAGTACCACA TCTTA.AATAC TC7CCAGTGT 960 

TCTGAGTCTA TTATGGGAGG AGCAGTCTTT GAGACGCTGG AAGGAAATAC AATTGAGATA 102 0 
GGATGTGACG GTGACAGTAT AACAGTAAAT GGAATCAAAA TGGTGAACAA AAAGGATATT 1080 

\ ATGGTGTGAT CCATTTGATT GATCAGGTCC 7AATTCCTGA TTCTGCCAAA 1140 

G AGCTGGCTGG AAAACAGCAA ACCACCTTCA CGGATCTTG? 3GCCCAATTA 1200 

T CTGCTCTGAG GCCAGATGGA GAATACACTT TGCTGGCACC TGTGAATAAT 1260 

GCATTTTCTG ATGATACTCT CAGCATGGTT CAGCGCCTCC TTAAATTAAT TCTGCAGAAT 132 0 

CACATATTGA AAGTAAAAGT TGGCCTTAAT GAGCTTTACA AC3GGCAAA7 ACTGGAAACC 1380 

ATCGGAGGCA AACAGCTCAG AGTCTTCGTA TATCGTACAC- CT3TCTGCAT TGAAAATTCA 144 0 

TGCATGGAGA AAGGGAGTAA G CAAGGG AG A AACGGTGCGA TTCACATATT CCGCGAGATC 1500 

ATCAAGCCAG CAGAGAAATC CCTCCATGAA AAGTTAAAAC AAGATA^C-CG CTTTAGCACC 1560 

TTCCTCAGCC TACTTGAAGC TGCAGACTTG AAAGAGCTCC TGACACAACC TGGAGACTGG 162 0 

ACATTATTTG TGCCAACCAA TGATGCTTTT AAGGGAATGA CTAGTGAAGA AAAAGAAATT 1680 

CTGATACGGG ACAAAAATGC TCTTCAAAAC ATCATTCTTT ATCACCTGAC ACCAGGAGTT 1740 

TTCATTGGAA AAGGATTTGA ACCTGGTGTT ACTAACATTT TAAAGACCAC ACAAGGAAGC 1800 

AAAATCTTTC TGAAAGAAGT AAATGATACA CTTCTGGTGA AT3AATTGAA ATCAAAAGAA 1860 

TCTGACATCA TGACAACAAA TGGTGTAATT CATGTTGTAG ATAAACTCCT CTATCCAGCA 192 0 

GACACACCTG TTGGAAATGA TCAACTGCTG GAAATACTTA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT TTGTTCGTGG TAGCACCTTC AAAGAAATCC CC3TGACTGT CTATACAACT 2040 

AAAATTATAA CCAAAGTTGT GGAACCAAAA ATTAAAGTGA TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCAAAA CTGAAGGACC CACACTAACA AAAGTCAAAA TT3AAGGTGA ACCTGAATTC 2160 

AGACTGATTA AAGAAGGTGA AACAATAACT GAAGTGATCC AT3GAGAGCC AATTATTAAA 222 0 

AAATACACCA AAATCATTGA TGGAGTGCCT GTGGAAATAA CTGAAAAAGA GACACGAGAA 2280 

GAACGAATCA TTACAGGTCC TGAAATAAAA TACACTAGGA TTTCTACTGG AGGTGGAGAA 2 34 0 

ACAGAAGAAA CTCTGAAGAA ATTGTTACAA GAAGAGGTCA CCAAGGTCAC CAAATTCATT 240 0 

GAAGGTGGTG ATGGTCATTT ATTTGAAGAT GAAGAAATTA AAAGACTGC? TCAG3GAGAC 2460 

ACACCCGTGA GGAAGTTGCA AGCCAACAAA AAAGTTCAAG GTTC7AGAAG ACGATTAAGG 252 0 

GAAGGTCGTT CTCAGTGAAA ATCCAAAAAC CAGAAAAAAA TGTTTATACA ACCCTAAGTC 2580 

AATAACCTGA CCTTAGAAAA TTGTGAGAGC CAAGTTGACT TCAGGAACTG AAACATCAGC 264 0 

ACAAAGAAGC AATCAT C AAA TAATTCTGAA CACAAATTTA ATATTTTTTT TTCT3AATGA 2700 

GAAACATGAG GGAAATTGTG GAGTTAGCCT CCTGTGGTAA AGGAATTGAA GAAAATATAA 2760 

CACCTTACAC CCTTTTTCAT CTTGACATTA AAAGTTCTGG CTAACTTTGG AATCCATTAG 282 0 

AGAAAAATCC TTGTCACCAG ATTCATTACA ATTCAAATCG AAGAGTTGTG AACTGTTATC 2880 

CCATT GAAAA GACCGAGCCT TGTATGTATG TTATGGATAC AT AAAATG C A CGCAAGCCAT 2 94 0 

TATCTCTCCA TGGGAAGCTA AGTTATAAAA ATAGGTGCTT GGTGTACAAA ACTTTTTATA 3000 

TCAAAAGGCT TTGCACATTT CTATATGAGT GGGTTTACTG GTAAATTATG TTATTTTTTA 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT CATATGCTTC TTGCAATGCA TATTTTTTAA 312 0 

TCTCAAACGT TTCAATAAAA CCATTTTTCA GATATAAAGA GAATTACTTC AAATTGAGTA 3180 
ATTCAGAAAA ACTCAAGATT TAAGTTAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence 
Protein Accession #: NP_006466.1 

1 11 21 31 41 51 

1 I I I 1 I 

MIPFLPMFSL LLLLIVNPIN ANNHYDKILA HSRIRGRDQG PNVCALQQIL GTKXKYFSTC 60 

KNWYKKSICG QKTTVLYECC PGYMRMEGMK GCPAVLPIDH VYGTLGIVGA TTTQRYSDAS 12 0 

KLREEIEGKG SFTYFAPSNE AWDNLDSDIR RGLESNVNVE LLNALIISIIMI NKRMLTKDLK 180 

NGMIIPSMYN NLGLFINHYP NGVVTVNCAR IIHGNQIATN GWHVIDRVL TOIGTSIQDF 240 

IEAEDDLSSF RAAAITSDIL EALGRDGHFT LFAPTNEAFE KLPRGVLERF MGDKVASEAL 30 0 

MKYHILNTLO CSESIMGGAV FETLEGNTIE IGCDGDSITV NGIKMVNKKD IVTNNGVIHL 360 

IDQVLIPDSA KQVIELAGKQ QTTFTDLVAQ LGLASALRPD GEYTLLAPVN NAFSDDTLSM 42 0 

VQRLLKLILQ NKILKVKVGL NELYNGQILE TIGGKQLRVF VYRTAVCIEN SCMEKGSKQG 480 

RNGAIHIFRE IIKPAEKSLH EKLKQDKRFS TFLSLLEAAD LKELLTQPGD KTLFVPTNDA 540 

FKGMTSEEKE ILIRDKNALQ NIILYHLTPG VFIGKGFEPG VTNILKTTQG SKIFLKEVND 600 

TLLWELKSK ESDIMTTNGV IHWDKLLYP ADTPVGNDQL LEILNKLIKY IQIKFVRGST 660 

KIITKWEP KIKVIEGSLQ PIIKTEGPTL TKVKIEGEPE FRLIKEGETI 720 

KIYTKIIDGV PVEITEKETR EERIITGPEI KYTRISTGGG ETEETLKKLL 760 
IEGGDGHLFE DEEIKRLLQG DTPVRKLQAN KKVQGSRRRL REGRSQ 



TEVIHGEPII 



I I I I 

ATCCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAAAG TGGTGT7CTT 60 

TTCCTCTTGG GCATCATCTT GCTGGTTCTG ATTGGAGTGC A=iC-GAACCCC AGTAGTGAGA 120

AAGGGTCGCT GTTCCTGCAT CAGCACCAAC CAAGGGACTA TCCACCTACA ATCCTTGAAA 180 

GACCTTAAAC AATTTGCCCC AAGCCCTTCC TGCGAGAAAA TTGAAATCAT TGCTACACTG 240

AAGAATGGAG TTCAAACATG TCTAAACCCA GATTCAGCAG ATGTGAAGGA ACTGATTAAA 300 

AAGTGGGAGA AACAGGTCAG CCAAAAGAAA AAGCAAAAGA ATGGGAAAAA ACATCAAAAA 360 

AAGAAAGTTC TGAAAGTTCG AAAATCTCAA CGTTCTCGTC AAAAGAAGAC TACATAAGAG 420 

ACCACTTCAC CAATAAGTAT TCTGTGTTAA AAATGTTCTA TTT7AATTAT ACCGCTATCA 480 

TTCCAAAGGA GGATGGCATA TAATACAAAG GCTTATTAAT TTGACTAGAA AATTTAAAAC 540

ATTACTCTGA AATTGTAACT AAAGTTAGAA AGTTGATTTT AAGAATCCAA ACGTTAAGAA 600 

TTGTTAAAGG CTATGATTGT CTTTGTTCTT CTACCACCCA CCAGTTC-AAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAGAT GTTCACCCAA CCACATCCCA 720

CTCACAACAG CTGCCTGGAA GAGCAGCCCT AGGCTTCCAC GTACTGCAGC C7CCAGAGAG 780 

TATCTGAGGC ACATGTCAGC AAGTCCTAAG CCTGTTA3CA T3CTGGTGAG CCAAGCAGTT 840

CTACAGGCCT CACACACAAT GTGTCTGAGA GATTCAT3CT GATTGTTATT GGGTATCACC 960 

ACTGGAGATC ACCAGTGTGT GGCTTTCAGA GCCTCCTTTC TGGCTTTGGA AGCCATGTGA 1020
TTCCATCTTG CCCGCTCAGG CTGACCACTT TATTTCTTTT TGTTCCC 
AAGTCAGCTC TTCTCCATCC TACCACAATG CAGTGCCTTT CTTCTCTCCA GTGC 
CATATGCTCT GATTTATCTG AGTCAACTCC TTTCTCATCT TGTCCCCSiAC A~~" 

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCAGCT TAG7TCAAGT CCT3CCTCTT 1260 

AAATAAACCT TTTTGGACAC ACAAATTATC TTAAAACTCC TGTTTCACTT GGTTCAGTAC 1320 

CACATGGGTG AACACTCAAT GGTTAACTAA TTCTTGGGTG TTTATCCTAT CTCTCCAACC 1380
AGATTGTCAG CTCCTTGAGG GCAAGAGCCA CAGTATATTT CCCTGTTTCT TCCACAGTGC 1440

Z TGTGGAACTA GGTTTTAATA ATTTTTTAAT TGATGTT3TT ATGGGCAGGA 1500 

3 ACCATTGTCT CAGAGCAGGT GCTGGCTCTT TCCTGGCTAC TCCATGTTGG 1560 

CTAGCCTCTG GTAACCTCTT ACTTATTATC TTCAGGACAC TCACTACAGG GACCAGGGAT 1620 

GATGCAACAT CCTTGTCTTT TTATGACAGG ATGTTTGCTC AGCTTCTCCA ACAATAAGAA 1680 

GCACGTGGTA AAACACTTGC GGATATTCTG GACTGTTTTT AAAAAATATA CAGTTTACCG 1740 

AAAATCATAT AATCTTACAA TGAAAAGGAC TTTATAGATC AGCCAGTGAC CAACCTTTTC 1800 

CCAACCATAC AAAAATTCCT TTTCCCGAAG GAAAAGGGCT TTCTCAATAA GCCTCAGCTT 1860 

TCTAAGATCT AACAAGATAG CCACCGAGAT CCTTATCGAA ACTCATTTTA GGCAAATATG 1920 

AGTTTTATTG TCCGTTTACT TGTTTCAGAG TTTGTATTGT GATTATC 

TCTCCCATGA AGAAAGGGAA CGGTGAAGTA CTAAGCGCTA GAGGAAGCAG C 

TAGTGGAAGC ATGATTGGTG CCCAGTTAGC CTCTGCAGGA TGTGGAAACC TCCTTCCAGG 

GGAGGTTCAG TGAATTGTGT AGGAGAGGTT GTCTGTGGCC AGAATTT 

CTTTCCCAAA TTGAATCACT GCTCACACTG CTGATGATTT AGAGTGCTGT C 

TCCCACCCGA ACGTCTTATC TAATCATGAA ACTCCCTAGT TCCTTCATGT AACTTCCCTG 2280 



GTAGACAGTA TATAACTAAC AACCAMGAC TACATATTGT CACTGACACA CACGTTA7AA 2400
TCATTTATCA TATATATACA TACATGCATA CACTCTCAAA GCAAATAATT TTTCACTTCA 2460 
CCTTGTAATT TGAAATATTT TCTTTGTTAA AATAGAATGG 2520



TATCAATAAA TAGACCATTA ATCAG

Seq 
Prot 

25 i 



Protein Accession #: NP_002407



MKKSGVLFLL GIILLVLIGV QGTPWRKGR CSCISTNQGT IHLQSLKDLK QFAPSPSCEK 
IEIIATLKNG VQTCLNPDSA DVKELIKKWE KQVSQKKKQK NGKKHQKKKV LKVRKSQRSR 

Seq ID NO: 404 DNA sequence

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347




1 I I I I I 

CCGGCTCGCG CCCTCCGGGC CCAGCCTCCC GAGCCTTCG3 AGCGGGCGCC GTCCCAGCCC 60 

AGCTCCGGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC CGCCGCCGGG 120 

GACGGGCGTC TGCGGCTGGC GCGACTAGCG CTGGTACTCC TGGGCTGGGT CTCCTCGTCT 180 

TCTCCCACCT CCTCGGCATC CTCCTTCTCC TCCTCGGCGC CGTTCCTGGC TTCCGCCGTG 240

TCCGCCCAGC CCCCGCTGCC GGACCAGTGC CCCGCGCTGT GCGAGTGCTC CGAGGCAGCG 300

CGCACAGTCA AGTGCGTTAA CCGCAATCTG ACCGAGGTGC CCACGGACCT GCCCGCCTAC 360 

GTGCGCAACC TCTTCCTTAC CGGCAACCAG CTGGCCGTGC TCCCTGCCGG CGCCTTCGCC 420

CGCCGGCCGC CGCTGGCGGA GCTGGCCGCG CTCAACCTCA GCGGCAGCCG CCTGGACGAG 480 

GTGCGCGCGG GCGCCTTCGA GCATCTGCCC AGCCTGCGCC AGCTCGACCT CAGCCACAAC 540

CCACTGGCCG ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCCAGCGT CTCGGCCCCC 600 

AGTCCCCTTG TGGAACTGAT CCTGAACCAC ATCGTGCCCC CTGAAGATGA GCGGCAGAAC 660

CGGAGCTTCG AGGGCATGGT GGTGGCGGCC CTGCTGGCGG GCCGTGCACT GCAGGGGCTC 720

CGCCGCTTGG AGCTGGCCAG CAACCACTTC CTTTACCTGC CGCGGGATGT GCTGGCCCAA 780

CTGCCCAGCC TCAGGCACCT GGACTTAAGT AATAATTCGC TGGTGAGCCT GACCTACGTG 840

TCCTTCCGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900 

CTTCACAATG GCACCCTGGC TGAGTTGCAA GGTCTACCCC ACATTAGGGT TTTCCTGGAC 960 

AACAATCCCT GGGTCTGCGA CTGCCACATG GCAGACATGG TGACCTGGCT CAAGGAAACA 1020 

GAGQTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC CGGAAAAAAT GAGGAATCGG 1080 

GTCCTCTTGG AACTCAACAG TGCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCTG 1140

CAAACCTCTT ATGTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGCGCTAT TTTCCTCCTG 1200

GTTTTGTATT TGAACCGCAA GGGGATAAAA AAGTGGATGC ATAACATCAG AGATGCCTGC 1260 

AGGGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA ATGCGGACCC CAGATTAACA 1320 

AACCTCAGTT CTAACTCGGA TGTCTGAGAA ATATTAGAGG ACAGACCAAG GACAACTCTG 1380

CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCC7CCACTA 1440

TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCITCCITG TTATGTAAAG 1500 

TTTCTCGGTG TGTTCTGTTA ATGTAAGACG ATGAACAGTT GTGTATAGTG TTTTACCCTC 1560 

TTCTTTTTCT TGGAACTCCT CAACACGTAT GGAGGGATTT TTCAGGTITC AGCAT3AACA 1620 

TGGGCTTCTT GCTGTCTGTC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 1680 

ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740

TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 1800

CTGCAGACGT TAGCAGGCTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCTGCATCCA 1860 

AGAGCATGCT TACATTTTAC TGTTCTGCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920

TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCAGTTTAT ATGAAAATGT ACTGATTTTT 1980 

TTTTAATAAA CTGCATCGAG ATCCAACCGA CTGAATTGTT AAAAAAAAAA AAAAATAAAG 2040
ATTCTTAAAA GAA 



I I 1 I I I 

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 
QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLASL 
AALNLSGSRL DEVRAGAFEH LPSLRQLDLS HNPLADLSPF AFSGSNASVS APSPLVELIL 
NHIVPPEDER QNRSFEGMW AALLAGRALQ GLRRLELASN HFLYLPRDVL AQLPSLRHLD 
LSNNSLVSLT YVSFRNLTHL ESLHLEDNAL KVLHNGTLAE LQ3LPHIRVF LDNNPWVCDC 
HMADMVTWLK ETEWQGKDR LTCAYPEKMR NRVLLELNSA DLDCDPILPP SLQT3YVFL3 

IVLALIGAIF LLVLYLNRKG IKKWMHNIRD ACRDHMEGYK YRYEINADPR LTNL3SNSDV 

Seq ID NO: 406 DNA sequence
ATGCCTGGGG G
CTAGCGCTGG T. 

TTCTCCTCCT CGGCGCCGTT CCTGGCTTCC GCCGTGTCCG CCCAGCCCCC GCTC-CCG3AC 
CAGTGCCCCG CGCTGTGCGA GTGCTCCGAG GCAGCGCGCA CAGTCAAGTG CGTTAACCGC 
AATCTGACCG AGGTGCCCAC GGACCTGCCC GCCTACGTGC GCAACCTCTT CCTTACCGGC 
AACCAGCTGG CCAGCAACCA CTTCCTTTAC CTGCCGCGGG ATGTGCTGGC CCAACTGCCC 
C ACCTGGACTT AAGTAATAAT TCGCTGGTGA GCCTGACCTA C 



CCCTGGGTCT GCGACTGCCA CATGGCAGAC ATGGTGACCT GGC7CAAGGA AACAGAGGTA 
GTGCAGGGCA AAGACCGGCT CACCTGTGCA TATCCGGAAA AAA7GAGGAA TCGC-GTCCTC 
3GACTGT GACCCGATTC TTCCCCCATC CCTGCAAACC 
T TGTTTTAGCC CTGATAG3CG CTA7TTTCCT CCTGGTTTTG 
T AAAAAAGTGG ATGCATAACA TCAGAGATGC CTGCAGGGAT 
k CAGATATGAA ATCAATGCGG ACCCCAGATT AACAAACCTC 
AGTTCTAACT CGGATGTCCT CGAGTGA 



Protein Accession #: Eos sequence 

1 11 21 31 41 51 

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS PSSSAPFLAS AVSAQPPLPD 
QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLASNHFLY LPRDVLAQLP 
rHLESLH LEDNALKVLH NGTIAELQGL PHIRVFLDNN 



E INADPRLTNL 3 00 



21 31 41 51 

I I I 1 

G CTCCCCGCCA CCGCCATGGT CCCCGACACC GCCTGCGTTC TTCTGCTCAC 
CCTGGCTGCC CTCGGCGCGT CCGGACAGGG CCAGAGCCCG TTGGGCTCAG ACCTGGGCCC 
GCAGATGCTT CGGGAACTGC AGGAAACCAA CGCGGCGCTG CAGGACGTGC GG3ACTGGCT 
GCGGCAGCAG GTCAGGGAGA TCACGTTCCT GAAAAACACG GTGATGGAGT G 
CGGGATGCAG CAGTCAGTAC GCACCGGCCT ACCCAGCGTG CGGCCCCTGC T 

GCCCGGCTTC TGCTTCCCCG GCGTGGCCTG CATCCAGACG GAGAGCGGCG GCCGCTGCGG 360 

CCCCTGCCCC GCGGGCTTCA CGGGCAACGG CTCGCACTGC ACCGACGTCA ACGAGTGCAA 420

CGCCCACCCC TGCTTCCCCC GAGTCCGCTG TATCAACACC AGCCCGGGGT TCCGCTGCGA 480

GGCTTGCCCG CCGGGGTACA GCGGCCCCAC CCACCAGGGC GTGGGGCTGG CTTTCGCCAA 540 

GGCCAACAAG CAGGTTTGCA CGGACATCAA CGAGTGTGAG ACCGGGCAAC ATAACTGCGT 600 

CCCCAACTCC GTGTGCATCA ACACCCGGGG CTCCTTCCAG TGCGGCCCGT GCCAGCCCGG 660

CTCGCCCAGC GAGTGCCACG AGCATGCAGA CTGCGTCCTA GAGCGCGATG GCTCGCGGTC 730 

GTGCGTGTGT CGCGTTGGCT GGGCCGGCAA CGGGATCCTC TGTGGTCGCG ACACTGACCT 840

AGACGGCTTC CCGGACGAGA AGCTGCGCTG CCCGGAGCCG CAGTGCCGTA AGGACAACTG 900 

CGTGACTGTG CCCAACTCAG GGCAGGAGGA TGTGGACCGC GATGGCATCG GAGACGCCTG 960 

CGATCCGGAT GCCGACGGGG ACGGGGTCCC CAATGAAAAG GACAACTGCC CGCTGGTGCG 1020 

GAACCCAGAC CAGCGCAACA CGGACGAGGA CAAGTGGGGC GATGCGTGCG ACAACTGCCG 1080 

GTCCCAGAAG AACGACGACC AAAAGGACAC AGACCAGGAC GGCCGGGGCG AT3CGTGCGA 1140 

CGACGACATC GACGGCGACC GGATCCGCAA CCAGGCCGAC AACTGCCCTA GGGTACCCAA 1200

CTCAGACCAG AAGGACAGTG ATGGCGATGG TATAGGGGAT GCCTGTGACA ACTGTCCCCA 1260

GAAGAGCAAC CCGGATCAGG CGGATGTGGA CCACGACTTT GTGGGAGATG CTTGTGACAG 1320

CGATCAAGAC CAGGATGGAG ACGGACATCA GGACTCTCGG GACAACTGTC CCACGGTGCC 1380 

TAACAGTGCC CAGGAGGACT CAGACCACGA TGGCCAGGGT GATGCC7GCG ACGACGACGA 1440

CGACAATGAC GGAGTCCCTG ACAGTCGGGA CAACTGCCGC CTGGTGCCTA ACCCCGGCCA 1500 

GGAGGACGCG GACAGGGACG GCGTGGGCGA CGTGTGCCAG GACGACTTTG ATGCAGACAA 1560 

GGTGGTAGAC AAGATCGACG TGTGTCCGGA GAACGCTGAA GTCACGCTCA CCGACTTCAG 1620

GGCCTTCCAG ACAGTCGTGC TGGACCCGGA GGGTGACGCG CAGATTGACC CCAACTGGGT 1680 

GGTGCTCAAC CAGGGAAGGG AGATCGTGCA GACAATGAAC AGCGACCCAG GCCTGGCTGT 1740 

GGGTTACACT GCCTTCAATG GCGTGGACTT CGAGGGCACG TTCCATGTGA ACACGGTCAC 1800 

GGATGACGAC TATGCGGGCT TCATCTTTGG CTACCAGGAC AGCTCCAGCT TCTACGTGGT 1860

CATGTGGAAG CAGATGGAGC AAACGTATTG GCAGGCGAAC CCCTTCCGTC- CTGTGGCCGA 1920

GCCTGGCATC CAACTCAAGG CTGTGAAGTC TTCCACAGGC CCCGGGGAAC AGCTGCGGAA 1980 

CGCTCTGTGG CATACAGGAG ACACAGAGTC CCAGGTGCGG CTGCTGTGGA AGGACCCGCG 2040

AAACGTGGGT TGGAAGGACA AGAAGTCCTA 7CGT7GGTTC CTGCAGCACC GGCCCCAAGT 2100 

GGGCTACATC AGGGTGCGAT TCTATGAGGG CCCTGAGCTG GTGGCCGACA GCAACGTGGT 2160 

CTTGGACACA ACCATGCGGG GTGGCCGCCT GGGGGTCTTC TGCTTC7CCC AGGAGAACAT 2220 

CATCTGGGCC AACCTGCGTT ACCGCTGCAA TGACACCATC CCAGAGGACT ATGAGACCCA 2280 

TCAGCTGCGG CAAGCCTAGG GACCAGGGTG AGGACCCGCC GGATGACAGC CACCCTCACC 2340 

GCGGCTGGAT GGGGGCTCTG CACCCAGCCC AAGGGGTGGC CGTCCTGAGC- GGGAAGTGAG 2400
AAGGGCTCAG AGAGGACAAA ATAAAGTGTG T 

Seq ID NO: 409 Protein sequence 



MVPDTACVLL LTLAALGASG QGQSPLGSDL GPQMLRELQE TNAALQDVRD WLRQQVREIT 
FLKNTVMECD ACGMQQSVRT GLPSVRPLLH CAPGFCPPGV ACIQTESGGR CGPCPAGFTG 120

NGSHCTDVNE CNAHPCFPRV RCINTSPGFR CEACPPGYSG PTHQGVGLAF AKANKQVCTD 180 

INECETGQHN CVPNSVCINT RGSFQCGPCQ PGFVGDQASG CQRGAQRFCP DGSPSECHEH 240 

ADCVLERDGS RSCVCRVGWA GNGILCGRDT DLDGFPDEKL RCPEPQCRKD NCVTVPNSGQ 300 

EDVDRDGIGD ACDPDADGDG VPNEKDNCPL VRNPDQRNTD EDKWGDACD.M CRSQKNDDQK 360 

DTDQDGRGDA CDDDIDGDRI RNQADNCPRV PNSDQKDSDG DGIGDACDNC PQKSNPDQAD 420 

VDHDFVGDAC DSDQDQDGDG HQDSRDNCPT VPNSAQEDSD HDGQGDACDD DDDNDGVPDS 480 

RDNCRLVPNP GQEDADRDGV GDVCQDDFDA DKWDKIDVC PENABVTLTD FRAFQTWLD 540

PEGDAQIDPN WWLNQGREI VQTMNSDPGL AVGYTAFNGV DFEGTFHVNT VTDDDYAGFI 600 

FGYQDSSSFY WMWKQMEQT YWQANPFRAV AEPGIQLKAV KSSTGPGEQL RNALWHTGDT 660 

ESQVRLLKKD PRNVGWKDKK SYRWFLQHRP QVGYIRVRFY EGPELVADSN WLDTTMRGG 720

RLGVFCFSQE NIIWANLRYR CNDTIPEDYE THQLRQA 

Seq ID NO: 410 DNA sequence 

Nucleic Acid Accession #: NM_001565.1

Coding sequence: 67..363

i i 1 r i 1 r i 1 

TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC



ATTCAAGGAG TACCTCTCTC TAGAACCGTA CGCTGTACCT G' 
CCTGTTAATC CAAGGTCTTT AGAAAAACTT GAAATTATTC CTGCAAGCCA A 
CGTGTTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAGA AGAGATGTCT GAATCCAGAA 
TCGAAGGCCA TCAAGAATTT ACTGAAAGCA GTTAGCAAGG AAATGTC-AA AAGATCTCCT 
TAAAACCAGA GGGGAGCAAA ATCGATGCAG TGCTTCCAAG GATGGACCAC ACAGAGGCTG
CCTCTCCCAT CACTTCCCTA CATGGAGTAT ATGTCAAGCC A 
GTTACACTAA AAGGTGACCA ATGATGGTCA CCAAATCAGC TGCTACTACT C 
GGTTAATGTT CATCATCCTA AGCTATTCAG TAATAACTCT ACCCTGGCAC TATAATGTAA 600 
GCTCTACTGA GGTGCTATGT TCTTAGTGGA TGTTCTGACC CTGCTTCAAA TATTTCCCTC 660 
ACCTTTCCCA TCTTCCAAGG GTACTAAGGA ATCTTTCTGC TTTGGGGTTT ATCAGAATTC 720 
TCAGAATCTC AAATAACTAA AAGGTATGCA ATCAAATCTG CTTTTTAAAG AATGCTCTTT 780 
ACTTCATGGA CTTCCACTGC CATCCTCCCA AGGGGCCCAA ATTCTTTCAG TGGCTACCTA 840
CATACAATTC CAAACACATA CAGGAAGGTA GAAATATCTG AAAATGTATG TGTAAGTATT 900
CTTATTTAAT GAAAGACTGT ACAAAGTATA AGTCTTAGAT GTATATATTT CCTATATTGT 960
TTTCAGTGTA CATGGAATAA CATGTAATTA AGTACTATGT ATCAATGAGT AACAGGAAAA 1020
TTTTAAAAAT ACAGATAGAT ATATGCTCTG CATGTTACAT AAGATAAATG TGCTGAATGG 1080 
TTTTCAAATA AAAATGAGGT ACTCTCCTGG AAATATTAAG 

Seq I 



Seq ID 

Nucleic Acid 
Coding sequs 



.2 DNA sequence 
Accession #: XM



GGGAGGGAGA GAGGCGCGCG GGTGAAAGGC GCATTGATGC AGCCTGCGGC GGCCTCGGAG 60 

CGCGGCGGAG CCAGACGCTG ACCACGTTCC TCTCCTCGGT CTCCTCCGCC TCCAGCTCCG 120 

CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAGCGGCT 180 

CCGCGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTCGAGCG CCTCTGAGAT 240

CCCCAAGGGG AAGCAAAAGG CGCAGCTCCG GCAGAGGGAG GTGGTGGACC TGTATAATGG 300 

AATGTGCTTA CAAGGGCCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGG 360 

CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420 

TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 480 

ATTGAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTGT ACATTTACAA AGATGCGTTC 540

AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCATG 600 

CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660 

AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATCG 720 

CACTTCTTCT GTGGAAGGAC TTTGTGAAGG AATTGGTGCT GGATTAGTG3 ATGTTGCTAT 780 

CTGGGTTGGC ACTTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCAGT 840

TTCTCGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TT3CTACCTC 900 

TTTTTTTATT ATGCCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 960 

CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 1020 

TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG 3TTTCAATAT TTTTTTTAGT 1080 

TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTGGA ATATTGTTGT 1140

GGTCTTTTGT TTTTTCTCTT AGTATAGCAT TTTTAAAAAA ATATAAAAGC TACCAATCTT 1200 

TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 1260 



MRPQGPAASP QRLRGLLLLL LLQLPAPSSA SEIPKGKQKA QLRQREWDL YNGMCLQGPA 
GVPGRDGSPG ANGIPGTPGI PGRDGFKGEK GECLR3SFEE SWTPNYKQCS WSSLNYGIDL 
GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPL PIEAIIYLDQ
GSPEMNSTIN IHRTSSVEGL CEGIGAGLVD VAIWVGTCSD YPKGDASTGK NSVSRIIIEE 
Seq ID NO: 414 DNA sequence
Nucleic Acid Accession #: XM
Coding sequence: 138..2405
ATTCGGCACG AGACCGCGTG T7CGCGCCTG GTAGAGATTT C



51 
I 

- CTCGAAGACA 

CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTC-GGA CAACGAGGCC 
ICCTTTGCCC 
ACCACTGAC-A 
ACACGGCAAT 



AGGCGCAATG 
AAATCCCCTT 
GAATTGGGAA 
ACAGCTTTTC 
ACTTCAAAAT 
TCACTCAGAC 
AGACCACGAG 
TAAAAATAAG 
TAGAAACAGC
CAAGGACAGT 



GCGAGGAAGT TATCTGTAAT 



TCTGGCATTA ATGTTGACTT 



CGAAAAGCTC T 



AAGCAGCTCC 



TGAAAATCCT 
TGATGCTAGA 



ACTCCACCCA 
CAGGAGTGTT 



TAGAGACTCC 
GTGAGCCCCG 



CTTGATCC7G 
TTTCCCCCAG 
GGCAATTTCC 
TTCTTTGTCA 
AAGAATCCAT 
TGAGCGTCAC 
TCACTCCCAC 
CCATGACTCA 
ACCAGAACAT 
CTCAACTGTG 



GTTGAAGGGT 3 60 

ATACACCATG 420

TCAGACCATG 480 

CATAATCATG 540 

GATAGTTCAG 600

GCCAGTGGTA 660 

TACAACACTG 720 



GTGAGCCGGC 



CAGAGTTCAA C 



GTCTCTGCTG 



AGAAGAAACC 
CTCAACTTTC 
GAGCAGACTC 
AAGAGGTCAT 
GGTGCAAGAA 
TTCACCACCA 
CTCACAGTCA 
TGGCCTGGAT 




TCCACATTCT 
GAAAAGAGGA 
TGATTCCACG 
ACATGTCCTC 
TGAAAATGAT 
AACAAATGAG 
ACAAGAGCCC 
GATAGCTCAT 
TAAATGCCAT 
TCATGACTAC 
CAGCCAGCGC 
GGTGATAATG 
TA( T AAG31 
TCATGAATTA 
CCTTTATAAT 
TGGTCATTAT 
GTATGTTGCT 
ATGTAGCCSC 
TATGTTACTT 
GTTTAAATGC 
GTTTGTATGC 



CAAATAGCCT 
GGGGTTATCT 
GTGGCACTGG 
CATGCAAGTC 
CCACTTTTCA 
TGGAAGGGTC 
ACATTGATCA 
GATGATGTGG 
GAGAAAGTAG 
TCCCACTTTG 
GCTCATCCAC 
TCACATTTCC 
CATCATATTC 
TACTCTCGGG 



GGGTTGGTGG 
TAGTGCCTCT 
CCGTTGGGAC 



GTCATCTGTC 
TAACAGCTCT 
AACAATTTAA 
AGATTAAGAA 



TTGTATTGAA 
TATTCTATCT 
TAAACAAGAG 
TTTTCAAGAA 
TGTTTAGGAA 
AGCAAAGAAA 
AAAAATCACA 
CAGAATTAGT 
GTAGTGAGCA CTCTCATATA 
AAATATATTT AATGAATTCA 
G TTATATACCA 



TGGAGATAAA 
ATTTGGCATG 
CTAACACAGT 
TAAGAATGTG 
TAAAGGAGAA 
AAATTTGTTG 



TTATCAAGTG 
GGTGACTTTG 
GCATTGTCAG 
GCTGAAAATG 
CTGGTTGATA 
TGGGGGTATT 
ATTTCCATAT 
TAGAGTAGCT 
TGTACTATGC 
T3TTACAAAG 



ATTCTCAGCA 
AGGAAGTCTA 
ACGATACACT 
TCCATCATCA 
AGGAGCTGAA 
TGCACAATTT 
GTTTAAGTAC 
CTGTTCTACT 
CCATGCTGGC 
TTTCTATGTG 
TGGTACCTGA 
TCTTTTTACA 
TTGAACATAA 



CATGAATCGG 
TTTGAGTGGT 
TCATAGCCAT 
TTCTCAAAAC 
AGGAGGCCTG 
AGATAAGAAG 
GCAGTTGTCC 
TCGAACTGAA 



CCAGCCATCA 
GCTGAAATCC 
ATTTCCATCA 



GATGCTTTTT 
GAAGAACCAG 
ATAGAAGAAA 
TATTTCATGT 
AAAAAGAATC 
AAGTATGAAT 
GGCTATTTAC 



CAATGAATAT 
CGGCCAGTCA 
CCACCACCAA 
AGATGCCGGC 
CAGCGATGGC 



AAAGGCTGGC 
GTATCTTGGA 
GATATTTGCA 
AATGCTGCAC 
GAATGCTGGG 
AATCGTGTTT 



GTACCCAGAG 
GACGATCTCA 
AACCACCATC 
GTCGCCACTT 
CTAGCAATTG 
GTGTTCTGTC 
ATGACCGTTA 
ATGGCAACAG 
CTTACTGCTG 
AATGATGCTA 
ATGCTTTTGG 
CGTATAAATT 



A GTTAGTGGGT T 
G GTACGTTTTA A 
C GGTATTACCA G 



3 GGAAAAATGT C 



AAGAGAAGAA 



Z AGAGTAGTAA 



GAGCAATTGT 



TTTTACACAA 



iGGGAGGCAT 

'AGAATTAAG TATAAAAAGG 
ATTTTTGTCA GGATTATTTC CCGTAAAAAC 
TACATTTAAC TTTGTATAAT ACAGAAATCT 
AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 
GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 
CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 
AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 
AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 
GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 
TAAATTCCTT ATATCAGCTT G 

i sequence 



2040 
2100 
2160 
2220 

2 2 4 0 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 



3300 
3360 
3420 






I 



SVSASEVTST 



RSCLIHTSEK 
LVALAVGTLS 
TWKGLTALGG 
EEKVDTDDRT 
HSHFHDTLGQ 
MGDGLHNFSD 
NALSAMLAYL 
RWGYFFLQNA 



LTFALSVTNP 
SVEGFRKLLQ 
HHNHAASGKN 
VYNTVSEGTH 
FMYSRNTNEN 
KAEIPPKTYS 
GDAFLHLLPH 
LYFMFLVEHV 
EGYLRADSQE 
SDDLIHHHHD 
GLAIGAAFTE 

GMLLGFGIML 



LHELKAAAFP QTTEKISPNW SSGINVDLAI STRQYHLQQL 



KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 
FLETIETPRP GKLFPKDVSS STPPSVTSK3 RVSRLAGRKT 
PQECFNASKL LTSHGMGIQV PLNATEFNYL CPAI INQIDA 
LQIAWVGGFI AISIISFLSL LGVILVPLMN RVFFKFLLSF 
SHASHHHSHS HEEPAHEMKR GPLFSHLSSQ NI3ESAYFDS 
LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 
PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCXNKC 
YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 
GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 
YAENVSMWIF ALTAGLFMYV A
LISIFEHKIV FRINF 



seqaen 
ATGCCCAAGC GCGCGCACTG GGGGGCCCTC TCCGTGGTGC 7GATCCTGCT TTGGGGCCAT 60

CCGCGAGTGG CGCTGGCCTG CCCGCATCCT TGTGCCTGCT ACGTCCCCAG CGAGGTCCAC 120 

TGCACGTTCC GATCCCTGGC TTCCGTGCCC GCTGGCATTG CTAGACACGT GGAAAGAATC 180 

AATTTGGGGT TTAATAGCAT ACAGGCCCTG TCAGAAACCT CATTTGCAGG AC7GACCAAG 240

TTGGAGCTAC TTATGATTCA CGGCAATGAG ATCCCAAGCA TCCCCGATGG AGCTTTAAGA 300 

GACCTCAGCT CTCTTCAGGT TTTCAAGTTC AGCTACAACA AGCTGAGAGT GATCACAGGA 360 

CAGACCCTCC AGGGTCTCTC TAACTTAATG AGGCTGCACA TTGACCACAA CAAGATCGAG 420

I CTCAAGCTTT CAACGGCTTA ACGTCTCTGA GGCTACTCCA TT7GGAAGGA 480 

GCTGCA CCCCAGCACC TTCTCCACGT TCACATTTTT GGATTATTTC 540

A CCATAAGGCA CCTCTACTTA GCAGAGAACA TGGTTAGAAC TCTTCCTGCC 600

AGCATGCTTC GGAACATGCC GCTTCTGGAG AATCTTTACT TGCAGGGAAA TCCGTGGACC 660 

TGCGATTGTG AGATGAGATG GTTTTTGGAA TGGGATGCAA AATCCAGAGG AATTCTGAAG 720 

TGTAAAAAGG ACAAAGCTTA TGAAGGCGGT CAGTTGTGTG CAATGTGCTT CAGTCCAAAG 780 

AAGTTGTACA AACATGAGAT ACACAAGCTG AAGGACATGA CTTGTCTGAA GCCTTCAATA 840

GAGTCCCCTC TGAGACAGAA CAGGAGCAGG AGTATTGAGG AGGAGCAAGA ACAG3AAGAG 900 

GATGGTGGCA GCCAGCTCAT CCTGGAGAAA TTCCAACTGC CCCAGTGGAG CATCTCTTTG 960 

AATATGACCG ACGAGCACGG GAACATGGTG AACTTGGTCT GTGACATCAA GAAACCAATG 1020

GATGTGTACA AGATTCACTT GAACCAAACG GATCCTCCAG ATATTGACAT AAATGCAACA 1080 

GTTGCCTTGG ACTTTGAGTG TCCAATGACC CGAGAAAACT ATGAAAAGCT ATGGAAATTG 1140

ATAGCATACT ACAGTGAAGT TCCCGTGAAG CTACACAGAG AGCTCATGCT CAGCAAAGAC 1200 

CCCAGAGTCA GCTACCAGTA CAGGCAGGAT GCTGATGAGG AAGCTCTTTA CTACACAGGT 1260 

GTGAGAGCCC AGATTCTTGC AGAACCAGAA TGGGTCATGC AGCCATCCAT AGATATCCAG 1320 

CTGAACCGAC GTCAGAGTAC GGCCAAGAAG GTGCTACTTT CCTACTACAC CCAGTATTCT 1380

CAAACAATAT CCACCAAAGA TACAAGGCAG GCTCGGGGCA GAAGCTGGGT AATGATTGAG 1440

CCTAGTGGAG CTGTGCAAAG AGATCAGACT GTCCTGGAAG GGGGTCCATG CCAGTTGAGC 1500 

TGCAACGTGA AAGCTTCTGA GAGTCCATCT ATCTTCTGGG TGCTTCCAGA TGGCTCCATC 1560 

CTGAAAGCGC CCATGGATGA CCCAGACAGC AAGTTCTCCA TTCTCAGCAG TGGCTGGCTG 1620

AGGATCAAGT CCATGGAGCC ATCTGACTCA GGCTTGTACC AGTGCATTGC TCAAGTGAGG 1680 

GATGAAATGG ACCGCATGGT ATATAGGGTA CTTGTGCAGT CTCCCTCCAC TCAGCCAGCC 1740 

GAGAAAGACA CAGTGACAAT TGGCAAGAAC CCAGGGGAGT CGGTGACATT GCCTTGCAAT 1800

GCTTTAGCAA TACCCGAAGC CCACCTTAGC TGGATTCTTC CAAACAGAAG GATAATTAAT 1860 

GATTTGGCTA ACACATCACA TGTATACATG TTGCCAAATG GAACTCTTTC CATCCCAAAG 1920

GTCCAAGTCA GTGATAGTGG TTACTACAGA TGTGTGGCTG TCAACCAGCA AGGGGCAGAC 1980 

CATTTTACGG TGGGAATCAC AGTGACCAAG AAAGGGTCTG GCTTGCCATC CAAAAGAGGC 2040

AGACGCCCAG GTGCAAAGGC TCTTTCCAGA GTCAGAGAAG ACATCGTGGA GGATGAAGGG 2100 

GGCTCGGGCA TGGGAGATGA AGAGAACACT TCAAGGAGAC TTCTGCATCC AAAGGACCAA 2160 

GAGGTGTTCC TCAAAACAAA GGATGATGCC ATCAATGGAG ACAAGAAAGC CAAGAAAGGG 2220

AGAAGAAAGC TGAAACTCTG GAAGCATTCG GAAAAAGAAC CAGAGACCAA TGTTGCAGAA 2280 

GGTCGCAGAG TGTTTGAATC TAGACGAAGG ATAAACATGG CAAACAAACA GATTAATCCG 2340

GAGCGCTGGG CTGATATTTT AGCCAAAGTC CGTGGGAAAA ATCTCCCTAA GGGCACAGAA 2400

GTACCCCCGT TGATTAAAAC CACAAGTCCT CCATCCTTGA GCCTAGAAGT CACACCACCT 2460 

TTTCCTGCTG TTTCTCCCCC CICAGCATCT CCTGTGCAGA CAGTAACCAG TGCTGAA3AA 2520

TCCTCAGCAG ATGTACCTCT ACTTGGTGAA GAAGAGCACG TTTTGGGTAC CATTTCCTCA 2580 

GCCAGCATGG GGCTAGAACA CAACCACAAT GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640

AGCACACCTC TGGAGGAAGT TGTTGATGAC CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700

ACTGAACGAG ACCTGAAGGC GACAGCAGCC CCTACACTTA TATCTGAGCC TTATGAACCA 2760 

TCTCCTACTC TGCACACATT AGACACAGTC TATGAAAAGC CCACCCATGA AGAGACGGCA 2820 

ACAGAGGGTT GGTCTGCAGC AGATGTTGGA TCGTCACCAG AGCCCACATC CAGTGAGTAT 2880

GAGCCTCCAT TGGATGCTGT CTCCTTGGCT GAGTCTGAGC CCATGCAATA CTTTGACCCA 2940 

GATTTGGAGA CTAAGTCACA ACCAGATGAG GATAAGATGA AAGAAGACAC CTTTGCACAC 3000

CTTACTCCAA CCCCCACCAT CTGGGTTAAT GACTCCAGTA CATCACAGTT ATTTGAGGAT 3060 

TCTACTATAG GGGAACCAGG TGTCCCAGGC CAATCACATC TACAAGGACT GACAGACAAC 3120 

ATCCACCTTG TGAAAAGTAG TCTAAGCACT CAAGACACCT TACTGATTAA AAAGGGTATG 3180

AAAGAGATGT CTCAGACACT ACAGGGAGGA AATATGCTAG AGGGAGACCC CACACACTCC 3240

AGAAGTTCTG AGAGTGAGGG CCAAGAGAGC AAATCCATCA CTTTGCCTGA CTCCACACTG 3300

GGTATAATGA GCAGTATGTC TCCAGTTAAG AAGCCTGCGG AAACCACAGT TGGTACCCTC 3360

CTAGACAAAG ACACCACAAC AGTAACAACA ACACCAAGGC AAAAAGTTGC TCCGTCATCC 3420 

ACCATGAGCA CTCACCCTTC TCGAAGGAGA CCCAACGGGA GAAGGAGATT ACGCCCCAAC 3480

AAATTCCGCC ACCGGCACAA GCAAACCCCA CCCACAACTT TTGCCCCATC AGAGACTTTT 3540 

TCTACTCAAC CAACTCAAGC ACCTGACATT AAGATTTCAA GTCAAGTGGA GAGTTCTCTG 3600 

GTTCCTACAG CTTGGGTGGA TAACACAGTT AATACCCCCA AACAGTTGGA AATGGAGAAG 3660 

AATGCAGAAC CCACATCCAA GGGAACACCA CGGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720 

CATCGATATA CCCCTTCTAC AGTGAGCTCA AGAGCGTCCG GATCCAAGCC CA3CCCTTCT 3780 

A AACATAGAAA CATTGTTACT CCCAGTTCAG AAACTATACT TTTGCCTAGA 3840

C TGAAAACTGA GGGCCCTTAT GATTCCTTAG ATTACATGAC AACCACCAGA 3900

T CATCTTACCC TAAAGTCCAA GAGACACTTC CAGTCACATA TAAACCCACA 3960

TCAGATGGAA AAGAAATTAA GGATGATGTT GCCACAAATG TTGACAAACA TAAAAGTGAC 4020

ATTTTAGTCA CTGGTGAATC AATTACTAAT GCCATACCAA CTTCTCGCTC CTTGGTCTCC 4080

G AATTTAAGGA AGAATCCTCT CCTGTAGGCT TTCCAGGAAC TCCAACCTGG 4140

A GGACGGCCCA GCCTGGGAGG CTACAGACAG ACATACCTGT TACCACTTCT 4200

C TTACAGACCC TCCCCTTCTT AAAGAGCTTG AGGATGTGGA TTTCACTTCC 4260 

GAGTTTTTGT CCTCTTTGAC AGTCTCCACA CCATTTCACC AGGAAGAAGC TGGTTCTTCC 4320

ACAACTCTCT CAAGCATAAA AGTGGAGGTG GCTTCAAGTC AGGCAGAAAC CACCACCCTT 4380 

GATCAAGATC ATCTTGAAAC CACTGTGGCT ATTCTCCTTT CTGAAACTAG ACCACAGAAT 4440 

CACACCCCTA CTGCTGCCCG GATGAAGGAG CCAGCATCCT CGTCCCCATC CACAATTCTC 4500 

ATGTCTTTGG GACAAACCAC CACCACTAAG CCAGCACTTC CCAGTCCAAG AATATCTCAA 4560 

GCATCTAGAG ATTCCAAGGA AAATGTTTTC TTGAATTATG TGGGGAATCC AGAAACAGAA 4620 

GCAACCCCAG TCAACAATGA AGGAACACAG CATATGTCAG GGCCAAATGA ATTATCAACA 4680 

CCCTCTTCCG ACCGGGATGC ATTTAACTTG TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740

TTTGGTAGTA GGAGTCTACC ACGTGGCCCA GATAGCCAAC GCCAGGATGG AAGAGTTCAT 4800

GCTTCTCATC AACTAACCAG AGTCCCTGCC AAACCCATCC TACCAACAGC AACAGTGAGG 4860 

CTACCTGAAA TGTCCACACA AAGCGCTTCC AGATACTTTG TAACTTCCCA GTCACCTCGT 4920

CACTGGACCA ACAAACCGGA AATAACTACA TATCCTTCTG GGGCTTTGCC AGAGAACAAA 4980 

CAGTTTACAA CTCCAAGATT ATCAAGTACA ACAATTCCTC TCCCATTGCA CATGTCCAAA 5040
CCCAGCATTC CTAGTAAGTT TACTGACCGA AGAACTGACC AA7TCAATGG TTACTCCAAA 5100

GTGTTTGGAA ATAACAACAT CCCTGAGGCA AGAAACCCAG TTGGAAAGCC 7CCCAGTCCA 5160 

AGAATTCCTC ATTATTCCAA TGGAAGACTC CC1TTCTTTA CCAACAAGAC TCTTTCTTTT 5220 

CCACAGTTGG GAGTCACCCG GAGACCCCAG ATACCCACTT CTCCTGCCCC AGTAATGAGA 5280

GAGAGAAAAG TTATTCCAGG TTCCTACAAC AGGATACATT CCCATAGCAC CTTCCATCTG 5340 

GACTTTGGCC CTCCGGCACC TCCGTTGTTG CACACTCCGC AGACCACGGG ATCACCCTCA 5400 

ACTAACTTAC AGAATATCCC TATGGTCTCT TCCACCCAGA GTTCTATCTC CTTTATAACA 5460

TCTTCTGTCC AGTCCTCAGG AAGCTTCCAC CAGAGCAGCT CAAAGT7CTT 7GCAGGAGGA 5520 

CCTCCTGCAT CCAAATTCTG GTCTCTTGGG GAAAAGCCCC AAATCCTCAC CAAGTCCCCA 5580 
CAGACTGTGT CCGTCACCGC TGAGACAGAC ACTGTGTTCC CCTGTOtf 
CCAAAGCCTT TCGTTACTTG GACAAAGGTT TCCACAGGAG CTCTTATGAC T 

AGGATACAAC GGTTTGAGGT TCTCAAGAAC GGTACCTTAG TGATACGGAA GGTTCAAGTA 5760

CAAGATCGAG GCCAGTATAT GTGCACCGCC AGCAACCTGC ACGGCCTGGA CAGGATGGTG 5820

GTCTTGCTTT CGGTCACCGT GCAGCAACCT CAAATCCTAG CCTCCCACTA CCAGGACGTC 5880 

ACTGTCTACC TGGGAGACAC CATTGCAATG GAGTGTCTGG CCAAAGGGAC CCCAGCCCCC 5940 



CCGGGGCTCA GCATTCACAT TCACTGCACT GCCAAGGCTG CGCCCCTGCC CAGCGTGCGC 
TGGGTGCTCG GGGACGGTAC CCAGATCCGC CCCTCGCAGT TCCTCCACGG GAACTTGTTT 
GTTTTCCCCA ACGGGACGCT CTACATCCGC AACCTCGCGC CCAAGGACAG CGGGCGCTAT 
GAGTGCGTGG CCGCCAACCT GGTAGGCTCC GCGCGCAGGA CGGTGCi 
CGTGCAGCAG CCAACGCGCG CATCACGGGC ACCTCCCCGC GGAGGACGGA C 
GGAGGAACCC TCAAGCTGGA CTGCAGCGCC TCGGGGGACC CCTGGCC 
AGGCTGCCGT CCAAGAGGAT GATCGACGCG CTCTTCAGTT TTGATAGCAG A 

TTTGCCAATG GGACCCTGGT GGTGAAATCA GTGACGGACA AAGATGCCGG AGATTACCTG 6660

TGCGTAGCTC GAAATAAGGT TGGTGATGAC TACGTGGTGC TCAAAGTGGA TGTGGTGATG 6720 

AAACCGGCCA AGATTGAACA CAAGGAGGAG AACGACCACA AAGTCTTCTA CGGGGGTGAC 6780 

CTGAAAGTGG ACTGTGTGGC CACCGGGCTT CCCAATCCCG AGATCTCCTG GAGCCTCCCA 6840 

GACGGGAGTC TGGTGAACTC CTTCATGCAG TCGGATGACA GCGGTGGACG CACCAAGCGC 6900 

TATGTCGTCT TCAACAATGG GACACTCTAC TTTAACGAAG TGGGGATGAG GGAGGAAGGA 6960

GACTACACCT GCTTTGCTGA AAATCAGGTC GGGAAGGACG AGATGAGAGT CAGAGTCAAG 7020 

GTGGTGACAG CGCCCGCCAC CATCCGGAAC AAGACTTACT TGGCGGTTCA GGTGCCCTAT 7080 

GGAGACGTGG TCACTGTAGC CTGTGAGGCC AAAGGAGAAC CCATGCCCAA GGTGACTTGG 7140 

TTGTCCCCAA CCAACAAGGT GATCCCCACC TCCTCTGAGA AGTATCAGAT ATACCAAGAT 7200

C TTATTCAGAA AGCCCAGCGT TCTGACAGCG GCAACTACAC CTGCCTGGTC 7260

IAGAGGA TAGGAAGACG GTGTGGATTC ACGTCAACGT CCAGCCACCC 7320 

iCCCCAA CCCCATCACC ACCGTGCGGG AGATAGCAGC CGGGGGCAGT 7380

A TTGACTGCAA AGCTGAAGGC ATCCCCACCC CGAGGGTGTT ATGGGCTTTT 7440 

:TCTGCC AGCTCCATAC TATGGAAACC GGATCACTGT CCATGGCAAC 7500 

GGTTCCCTGG ACATCAGGAG TTTGAGGAAG AGCGACTCCG TCCAGCTGGT ATGCATGGCA 7560

CGCAACGAGG GAGGGGAGGC GAGGTTGATC GTGCAGCTCA CTGTCCTGGA GCCCATGGAG 7620

AAACCCATCT TCCACGACCC GATCAGCGAG AAGATCACGG CCATGGOGSG CCACACCATC 7680 

AGCCTCAACT GCTCTGCCGC GGGGACCCCG ACACCCAGCC TGGTGTGGGT CCTTCCCAAT 7740 

GGCACCGATC TGCAGAGTGG ACAGCAGCTG CAGCGCTTCT ACCACAAGGC TGACGGCATG 7800 

CTACACATTA GCGGTCTCTC CTCGGTGGAC GCTGGGGCCT ACCGCTGCGT GGCCCGCAAT 7860

GCCGCTGGCC ACACGGAGAG GCTGGTCTCC CTGAAGGTGG GACTGAAGCC AGAAGCAAAC 7920 

AAGCAGTATC ATAACCTGGT CAGCATCATC AATGGTGAGA CCCTGAAGCT CCCCTGCACC 7980 

CCTCCCGGGG CTGGGCAGGG ACGTTTCTCC TGGACGCTCC CCAATGGCAT GCATCTGGAG 8040 

GGCCCCCAAA CCCTGGGACG CGTTTCTCTT CTGGACAATG GCACCCTCAC GGTTCGTGAG 8100 

GCCTCGGTGT TTGACAGGGG TACCTATGTA TGCAGGATGG AGACGGAGTA CGGCCCTTCG 8160
GTCACCAGCA TCCCCGTGAT TGTGATCGCC TATCCTCCCC GGATCAC 
CCGGTCATCT ACACCCGGCC CGGGAACACC GTGAAACTGA ACTGCA1 
CCCAAAGCTG ACATCACGTG GGAGTTACCG GATAAGTCGC ATCTGAAGGC A 

ACACAGAGAG ATGCCGGCTT CTACAAGTGC ATGGCAAAAA ACATTCTCGG C 

AAAACAACTT ACATCCACGT CTTCTGAAAT GTGGATTCCA GAATGATTGC TTAGGAACTG 8520 

ACAACAAAGC GGGGTTTGTA AGGGAAGCCA GGTTGGGGAA TAGGAGCTCT TAAATAATGT 8580

GTCACAGTGC ATGGTGGCCT CTGGTGGGTT TCAAGTTGAG GTTGATCTTG ATCTACAATT 8640 

GTTGGGAAAA GGAAGCAATG CAGACACGAG AAGGAGGGCT CAGCCTTGCT GAGACACTTT 8700 

CTTTTGTGTT TACATCATGC CAGGGGCTTC ATTCAGGGTG TCTGTGCTCT GACTGCAATT 8760 

TTTCTTCTTT TGCAAATGCC ACTCGACTGC CTTCATAAGC GTCCATAGGA TATCTGAGGA 8820 

ACATTCATCA AAAATAAGCC ATAGACATGA ACAACACCTC ACTACCCCAT TGAAGACGCA 8880

TCACCTAGTT AACCTGCTGC AGTTTTTACA TGATAGACTT TGTTCCAGAT TGACAAGTCA 8940 

TCTTTCAGTT ATTTCCTCTG TCACTTCAAA ACTCCAGCTT GCCCAA7AAG GATTTAGAAC 9000 

CAGAGTGACT GATATATATA TATATATTTT AATTCAGAGT TACATACATA CAGCTACCAT 9060

TTTATATGAA AAAAGAAAAA CATTTCTTCC TGGAACTCAC TTTTTATATA ATGTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCAGACGAT GAGACTAGAA GGAGAAATAC TTTCTGTCTT 9180 

ATTAAAATTA ATAAATTATT GGTCTTTACA AGACTTGGAT ACATTACAGC AGACATGGAA 9240

ATATAATTTT AAAAAATTTC TCTCCAACCT CCTTCAAATT CAGTCACCAC TGTTATATTA 9300

CCTTCTCCAG GAACCCTCCA GTGGGGAAGG CTGCGATATT AGATTTCCTT GTATGCAAAG 9360 

X AAGCTGTGCT CAGAGGAGGT GAGAGGAGAG GAAGGAGAAA ACTGCATCAT 9420 

TGAATCT AGAGTCTTCC CCGAAAAGCC CAGAAACTTC TCTGCAGTAT 9480

Z CATCTGGTCT AAGGTGGCTG CTTCTTCCCC AGCCATGAGT CAGTTTGTGC 9540 

ICGACCT GTTATTTCCA TGACTGCTTT ACTGTATTTT TAAGGTCAAT 9600 
ATACTGTACA TTTGATAATA AAATAATATT CTCCCAAAAA AAAAA 

Seq ID NO: 417 Protein sequence 



L SWLILLWGH PRVALACPHP CACYVPSEVH CTFRSLA3VP AGIARHVERI 
NLGFNSIQAL SETSFAGLTK LELLMIHGNE IPSIPDGALR DLSSLQVFKF SYKKLRVITG 
QTLQGLSNLM RLHIDHNKIE FIHPQAFNGL TSLRLLHLEG NLLHQLHPST FSTFTFLDYF 
RLSTIRHLYL AENMVRTLPA SMLRNMPLLE NLYLQGNPWT CDCEMRWFLE WDAKSRGILK 
CKKDKAYEGG QLCAMCFSPK KLYKHEIHKL KDMTCLKPSI E3PLRQNRSR SIEEEQEQEE 300

DGGSQLILEK FQLPQWSISL NMTDEHGNMV NLVCDIKKPM DVYKIHLNQT DPPDIDINAT 360 

VRAQILAEPE WVMQPSIDIQ LNRRQSTAKK VLLSYYTQYS QTISTKDTRQ ARGRSWVMIE 480 

PSGAVQRDQT VLEGGPCQLS CNVKASESPS IFWVLPDGSI LKAPMDDPDS KFSILSSGWL 540 

RIKSMEPSDS GLYQCIAQVR DEMDRMVYRV LVQSPSTQPA EKDTVTIGKN PGESVTLPCN 600 

ALAIPEAHLS WILPNRRIIN DLANTSHVYM LPNGTLSIPK VQVSDSGYYR CVATOIQQGAD 660 

HFTVGITVTK KGSGLPSKRG RRPGAKALSR VREDIVEDEG G3GHGDEENT SRRLLHPKDQ 720 

k INGDKKAKKG RRKLKLWKHS EKEPETNVAE GRRVFESRRR INMANKQINP 780 

7 RGKNLPKGTE VPPLIKTTSP PSLSLEVTPP FPAVSPPSAS FVQTVTSAEE 840 

3 EEHVLGTISS ASMGLEHNHN GVILVEPEVT STFLEEWDD LSEKTEEITS 900 



EPPLDAVSLA ESEPMQYFDP DLETKSQPDE DKMKEDTFAH LTFTPTIWVN DSSTSQLFED 1020 

STIGEPGVPG QSHLQGLTDN IHLVKSSLST QDTLLIKKGM KEMSQTLQGG NHLEGDPTHS 1080 

RSSESEGQES KSITLPDSTL GIMSSMSPVK KPAETTVGTL LDKDTTIV7T TPRQKVAPSS 1140 

TMSTHPSRRR PNGRRRLRPN KFRHRHKQTP PTTFAPSETF STQPTQAPDI KISS^QVESSL 1200 

VPTAWVDNTV NTPKQLEMEK NAEPTSKGTP RRKHGKRPNK HRYTPSIVSS RASGSKPSPS 1260 

PENKHRNIVT PSSETILLPR TVSLKTEGPY DSLDYMTTTR KIYSSYPKVQ ETLPVTYKPT 1320 

SDGKEIKDDV ATNVDKHKSD ILVTGESITN AIPTSRSLVS TMGEFKEESS FVGFPGTPTW 1380 

NPSRTAQPGR LQTDIPVTTS GENLTDPPLL KELEDVDFTS SFLSSLTVST PFHQEEAGSS 1440 

TTLSSIKVEV ASSQAETTTL DQDHLETTVA ILLSETRPQN HTPTAARMKS PASSSPSTIL 1500 

MSLGQTTTTK PALPSPRISQ ASRDSKENVF LNYVGNPSTE ATPVIMEGTQ KMSGPNELST 1560 

PSSDRDAFNL STKLELEKQV FGSRSLPRGP DSQRQDGRVH ASKQLTRVPA KPILPTATVR 1620 

LPEMSTQSAS RYFVTSQSPR HWTNKPEITT YPSGALPENK QFTTPRLSST TIPLPLHMSK 1680 

PSIPSKFTDR RTDQFNGYSK VFGNNNIPEA RNPVGKPPSP RIPHYSNGRL PFFTNKTLSF 1740

PQLGVTRRPQ IPTSPAPVMR ERKVIPGSYN RIHSHSTFHL DFGPPAPPLL KTPQTTGSPS 1800 

TNLQNIPMVS STQSSISFIT SSVQSSGSFH QSSSKFFAGG PPASKFKSLG EKPQILTKSP 1860 

QTVSVTAETD TVFPCEATGK PKPFVTWTKV STGALMTPNT RIQRFEVLKN GTLVIRKVQV 1920 

QDRGQYMCTA SNLHGLDRMV VLLSVTVQQP QILASHYQDV TVYLGDTIAM ECLAKGTPAP 1980 

QISWIFPDRR VWQTVSPVES RITLHENRTL SIKEASF3DR GVYKCVASNA AGADSLAIRL 2040 

HVAALPPVIH QEKLENISLP PGLSIHIHCT AKAAPLPSUR WVLGDGTQIR PSQFLHGNLF 2100 

VFPNGTLYIR NLAPKDSGRY ECVAANLVGS ARRTVQLNVQ RAAANARITG TSPRRTDVRY 2160 

GGTLKLDCSA SGDPWPRILW RLPSKRMIDA LFSFDSRIKV FANGTLWKS VTDKDAGDYL 2220 

CVARNKVGDD YWLKVDWM KPAKIEHKEE NDHKVFYGGD LKVDCVATGL PNPEISWSLP 2280 

DGSLVNSFHQ SDDSGGRTKR YWFNNGTLY FNEVGMREEG DYTCFAENQV GKDEMRVRVK 2340 

WTAPATIRN KTYLAVQVPY GDWTVACEA KGEPMPKVTW LSPTNKVIPT SSEKYQIYQD 2400

GTLLIQKAQR SDSGNYTCLV RHSAGEDRKT VWIHVNVQPP KINGNPNPIT TVREIAAGGS 2460 

RKLIDCKAEG IPTPRVLWAF PEGWLPAPY YGNRITVHGN GSLDIRSLRK SDSVQLVCMA 2520

RNEGGEARLI VQLTVLEPME KPIFHDPISE KITAMAGHTI SLNCSAAGTP TPSLVWVLPN 2580 

GTDLQSGQQL QRFYHKADGM LHISGLSSVD AGAYRCVARN AAGHTERLVS LKVGLKPEAN 2640 

KQYHNLVSII NGETLKLPCT PPGAGQGRFS WTLPNGMHLE GPQTLGRVSL LDNGTLTVRE 2700

ASVFDRGTYV CRMETEYGPS VTSIPVIVIA YPPRITSEPT PVIYTRPGNT VKLNCMAMGI 2760 

PKADITWELP DKSHLKAGVQ ARLYGNRFLH PQGSLTIQHA TQRDAGFYKC MAKNILG3DS 2820
KTTYIHVF 



I I I I I I

ATGCCAGGCA CAAAACTAAC CCGAACAGGC GCCCCAGCAG ACTACAGAGT GATATTGAAG 60 

ACCTCTCAAG AGGACGAATT GGATGTACCT GACGACATCA GCGTCCGGGT TATGTCATCT 120 

CAGTCTGTGC TTGTGTCCTG GGTGGATCCT GTTCTGGAAA AACAGAAGAA AGTTGTTGCA 180 

TCAAGACAGT ACACCGTGCG CTATCGAGAG AAGGGGGAAT TGGCCAGGTG GGATTATAAG 240

CAGATCGCTA ACAGGCGTGT GCTGATTGAG AACCTGATTC CAGACACTGT GTATGAATTT 300

GCAGTCCGTA TTTCACAGGG TGAAAGAGAT GGCAAATGGA GTACGTCAGT CTTCCAAAGA 360

ACACCAGAAT CTGCCCCTAC CACAGCTCCT GAAAACTTGA ACGTCTGGCC AGTCAATGGC 420 

AAACCTACAG TTGTCGCTGC ATCTTGGGAT GCGCTACCAG AGACTGAGGG GAAAGTGAAA 480 

GTCTGTCTGC TGGACACAGG ACTGTTTTCA GTTTCCTCCT TCCAACCATC TGCCAAATCA 540

TTTCAGAATA CATTCTTTCA TACGCCCCGG CTCTCAAACC ATTTGGAGCA AAGTCCCTCA 600 

CCTATCCTGG AGACACTACT TCTGCCCTGG TGGATGGTCT GCAGCCTGGG GAACGCTATC 660 

TTTTCAAAAT CCGGGCCACA AACAGGAGAG GCCTGGGACC TCACTCCAAA GCCTTCATTG 720

TCGCTATGCC AACAAGAATG CAGCTGTACC CAGAAGGATT TCAGTTGTCT AGCTTACCTG 780 

ATCGATATCC AAACCAAACA AGTTAATAAA GATCCACAAC TGGAAGGGAG TGTTTTTGGA 840

CCATGTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA TTGGCGGCTT TTCCTTCATT 900 

ATGTGCTATG AAGACCCANN TGTTTCTTCT TTGACAGGCA ATTCTTTAAA ATCTGTTGCA 960

GCCAGTAAGG CGGATGTTCA GCAGAACACG GAGGACAATG GGAAACCCGA AAAACCTGAG 1020 

CCTTCCTCAC CTTCTCCCAG AGCTCCAGCT TCCTCCCAAC ACCCCTCTGT GCCTGCTTCT 1080 

A GAAATGCCAA GGACCTTCTT CTTGACTTGA AGAACAAAAT ATTGGCTAAT 1140

: CCCGAAAACC CCAGCTTCGC GCCAAGAAGG CAGAGGAGCT GGATCTTCAG 1200 

:TGGGGA GGAGGAGCTG GGTTCCCGGG AGGACTCGCC CATGTCACCC 1260 

TCAGACACCC AAGACCAGAA ACGGACCCTG AGGCCGCCAA GTAGACACGG CCACTCGGTG 1320

GTTGCTCCCG GCAGGACTGC AGTGAGGGCC CGGATGCCAG CGCTGCCCCG AAGGGAAGGC 1380

GTAGATAAGC CTGGCTTTTC CCTGGCCACG CAGCCCCGCC CAGGGGCGCC CCCCTCGGCT 1440

TCGGCCTCTC CTGCCCACCA CGCGTCCACC CAGGGCACCT CTCATCGTCC TTCCCTGCCT 1500

GCCAGCTTGA ATGACAACGA CTTGGTGGAC TCAGACGAAG ATGAGCGCGC TGTGGGCTCC 1560 

CTCCACCCCA AGGGCGCCTT CGCCCAGCCC CGGCCAGCCC TGTCCCCCAG CCGCCAGTCC 1620 

CCGTCCAGCG TTCTCCGCGA CAGAAGCTCT GTGCACCCCG GCGCAAAGCC AGCCTCGCCG 1680 

GCGCGGAGGA CCCCCCATTC AGGGGCCGCA GAGGAAGATT CCAGTGCCTC AGCCCCACCC 1740

TCAAGACTTT CTCCACCCCA TGGGGGATCA TCTCGGCTGC TGCCCACCCA GCCACACCTG 1800 

AGCTCTCCAC TTTCCAAGGG CGGGAAGGAT GGTGAGGACG CCCCAGCCAC CAACTCCAAT 1860

GCGCCATCAC GGTCCACCAT GTCCTCCTCC GTCTCTTCTC ATCTCTCGTC CAGGACGCAG 1920

GTCTCTGAGG GAGCGGAGGC TTCTGATGGT GAAAGCCACG GTGACGGCGA TAGGGAAGAC 1980 

GGCGGAAGGC AGGCGGAGGC CACGGCCCAG ACGCTGCGGG CCCGGCCTGC CTCTGGACAC 2040 

TTCCATTTGC TCAGACACAA ACCCTTTGCT GCCAACGGGA GGTCTCCAAG CAGGTTCAGC 2100

ATTGGGCGGG GACCTCGGCT GCAGCCCTCC AGCTCCCCAC AGTCGACTGT GCCCTCCCGA 2160
GCCCACCCCA GGGTTCCCTC TCACTCTGAT TCCCACCCTA AGCTTAGCTC AGGTATCCAT 2220

GGAGACGAGG AGGATGAGAA GCCGCTTCCT GCCACCGTTG TCAATGACCA CGTGCCTTCC 2280 

TCCTCCAGGC AGCCCATCTC CCGGGGCTGG GAGGACTTAA GGAGAAGCCC GCAGAGAGGG 2340

GCCAGCCTGC ATCGGAAGGA ACCCATCCCA GAGAACCCCA AATCCACAGG GGCAGATACA 2400

CATCCTCAGG GCAAGTACTC CTCCCTGGCC TCCAAGGCTC AGGATGTTCA ACAGAGCACA 2460

GACGCGGACA CGGAGGGTCA TTCTCCCAAA GCACAGCCAG GGTCCACAGA CCGCCACGCG 2520 

TCCCCTGCTC GTCCTCCCGC AGCACGGTCA CAGCAGCATC CCAGTGTTCC CAGAAGGATG 2580 

ACACCCGGCC GGGCCCCAGA ACAGCAGCCC CCTCCTCCCG TCGCCACGTC CCAGCACCAC 2640

CCGGGACCCC AGAGCAGAGA CGCGGGTCGG TCACCTTCCC AGCCCAGGCT CTCACTGACC 2700 

CAGGCCGGGC GGCCCCGCCC CACGTCGCAG GGCCGCTCCC ACTCCTCCTC GGACCCTTAC 2760

ACGGCGAGCT CCAGAGGGAT GCTCCCCACG GCCCTCCAGA ACCAGGACGA GGATGCCCAG 2820

GGCAGCTACG ACGACGACAG CACAGAAGTC GAGGCCCAGG ATGTGCGGGC CCCCGCGCAC 2880 

GCCGCGCGCG CCAAGGAGGC AGCTGCGTCC CTTCCCAAGC ACCAGCAGGT GGAGTCTCCC 2940

ACAGGCGCAG GGGCAGGTGG CGACCACAGG TCCCAGCGCG GACATGCGGC CTCCCCCGCC 3000

AGGCCCAGCC GACCCGGCGG CCCCCAGTCC CGCGCCCGGG TCCCCAGCAG GGCAGCGCCG 3060

GGGAAGTCGG AGCCTCCTTC CAAGCGGCCC CTGTCCTCCA AGTCCCAGCA GTCGGTCTCA 3120 

GCCGAGGACG AGGAGGAGGA GGACGCGGGG TTTTTTAAAG GCGGGAAAGA AGACCTTCTG 3180 

TCTTCCTCTG TGCCAAAGTG GCCCTCTTCC TCCACTCCCA GGGGCGGCAA AGACGCCGAT 3240 

GGGAGCCTCG CCAAGGAAGA GAGGGAGCCT GCCATCGCGC TTGCCCCTCG CGGAGGGAGC 3300 

CTGGCTCCTG TGAAGCGACC TCTCCCCCCA CCTCCAGGCA GCTCCCCCAG GGCCTCCCAC 3360

GTCCCTTCCC GACCGCCGCC TCGCAGCGCT GCCACCGTGA GCCCCGTCGC GGGCACCCAC 3420 

CCCTGGCCGC GGTACACCAC GCGCGCCCCV CCTGGCCACT TCTCCACCAC CCCGATGCTG 3480 

TCCTTGCGCC AGAGGATGAT GCATGCCAGA TTCCGTAACC CTCTCTCCCG ACAGCCTGCC 3540

AGACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAGGGAA AGTCCTTCCT 3600 

GGTAGTAATG GAAAACCGAA TGGACAGAGA ATTATCAATG GCCCTCAAGG AACAAAGTGG 3660

GTTGTGGACC TTGATCGTGG GTTAGTATTG AATGCAGAAG GAAGGTACCT CCAAGATTCA 3720

CATGGAAATC CTCTTCGGAT TAAACTAGGA GGAGATGGTC GAACCATTGT AGATCTGGAA 3780 

GGGACCCCCG TGGTGAGTCC TGACGGCCTC CCACTCTTTG GGCAGGGGCG ACATGGCACA 3840

CCTCTGGCCA ATGCCCAAGA TAAGCCAATT TTGAGTCTTG GAGGAAAGCC GCTGGTGGGC 3900 

TTGGAGGTCA TCAAAAAAAC CACCCATCCC CCTACCACTA CCATGCAGCC CACCACTACT 3960

ACGACGCCCC TGCCTACCAC TACAACCCCG AGGCCCACCA CTGCCACCAC CATGCAGCCC 4020 

ACCACTACTA CGACGCCCCT GCCTACCACT ACACCGAGGC CCACCACTGC CACCACCCGC 4080 

CGCACGACCA CCAGGCGTCC AACAACCACA GTCCGAACCA CTACGCGGAC AACCACCACC 4140

ACCACCCCCA AACCCACCAC TCCCATCCCC ACCTGTCCCC CTGGGACCTT GGAACGGCAC 4200 

GACGATGATG GCAACCTGAT AATGAGCTCC AATGGGATCC CAGAGTGCTA CGCTGAAGAA 4260

GATGAGTTCT CAGGCTTGGA GACTGACACT GCAGTACCTA CGGAAGAGGC CTACGTTATA 4320

TATGATGAAG ATTATGAATT TGAGACGTCA AGGCCACCAA CCACCACTGA GCCTTCGACC 4380 

ACTGCTACCA CACCGAGGGT GATCCCAGAG GAAGGCGCCA TCAGTTCCTT TCCTGAAGAA 4440

GAATTTGATC TGGCTGGAAG GAAACGATTT GTTGCTCCTT ACGTGACGTA CCTAAATAAA 4500

GACCCATCAG CCCCGTGCTC TCTGACTGAT GCACTGGATC ACTTCCAAGT GGACAGCCTG 4560

GATGAAATCA TCCCCAATGA CCTGAAGAAG AGTGATCTGC CTCCCCAGCA TGCTCCCCGC 4620

AACATCACCG TGGTGGCCGT GGAAGGTTGC CACTCATTTG TCATTGTGGA TTGGGACAAA 4680 

GCCACCCCAG GAGATTTGGT CACAGGTTAT TTGGTTTACA GTGCATCCTA TGAAGATTTC 4740 

ATCAGGAACA AGTTTTCCAC TCAAGCTTCA TCAGTAACTC ACTTGCCCAT TGAGAACCTA 4800

AAGCCCAACA CGAGGTATTA TTTTAAAGTG CAAGCACAAA ATCCTCATGG CTACGGACCT 4860

ATCAGCCCTT CGGTCTCATT TGTCACCGAA TCAGATAATC CTCTGCTTGT TGTGAGGCCC 4920

CCAGGOGGTG AGCTATCTGG ATCCCATTCG CTTTCAAACA TGATCCCAGC TACACGGACT 4980

GTAATTCACT GAGGTATAAA ATCTACCTCA GTGACAACCT GAAAGATACA TTCTACAGCA 510 0 

TTGGAGACAG CTGGGGAAGA GGTGAAGACC ATTGCCAATT TGTGGATTCA CACCTTGATG 5160

GAAGAACAGG GCCTCAGTCC TATGTAGAAG CCCTCCCTAC TATTCAAGGC TACTATCGCC 5220

AGTATCGTCA GGAGCCTGTC AGGTTTGGGA ACATCGGCTT CGGAACCCCC TACTACTATG 5280

TGGGCTGGTA CGAGTGTGGG GTCTCCATCC CTGGAAAGTG GTAATCACAG GACCGTCATG 5340 

CTGCAAGCTT GCCCTGCCCA GCCCCACCAA CTAAGTCGCA CTAGGGGCTG TGAGCAAAGA 5400 

CAGCCAGCAT GCTCAGCCCC GCTGCCCTAG GTGCCAGGAA GGTCACAGAT GGACACTGGC 5460

CATTCTGGTC ATCTCAGTCT GGAACTCAGT CCCACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCAGTTTTG CTGTTAACTT TGCTTCTCTA CTTTTTTTTG TTTGTTTGTA ATAGCACATC 5580 

CCAGAGACAT CAGAAACCAG CAACTGATTC AGTGTGATTT CCCAGACTTT TTAGGCATGA 5640 

AATTCGGACA CTTCAGTATT TCCAGGAATA GCATATGCAC GCTGTTCTTG CTTCATGGAA 5700 

TGCTACATGC TTTCTGTTTT TCTCATTTTG GATTTCTCCA AAACTAACTG AATTTAAGCT 5760

TCAGGTCCCT TTGTATGCAG TAGAAAGGAA TTATTAAAAA CACCACCAAA GAAAATAAAT 5820

ATATCCTACT TGAAATTTAC TCTATGGACT TACCCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTGTAA ftTTCTC AATT TTGATATATA TATGTATATA TGCATATACA TATCCACACT 5940 

TGTCTGCAAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCACCAA AAAAAAAAAA 6000
AAAAAAA

Seq ID NO: 419 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I 

MPGTKLTRTG APADYRVILK TSQEDELDVP DDISVRVMSS QSVLVSWVDP VLEKQKKVVA 60

SRQYTVRYRE KGELARWDYK QIANRRVLIE NLIPDTVYEF AVRISQGERD GKWSTSVFQR 120

TPESAPTTAP ENLNVWPVNG KPTVVAASWD ALPETEGKVK VCLLDTGLFS VSSFQPSAKS 180

FQNTFFHTPR LSNHLEQSPS PILETLLLPW WMVCSLGNAI FSKSGPQTGE AWDLTPKPSL 240

SLCQQECSCT QKDFSCLAYL IDIQTKQVNK DPQLEGSVFG PCFLFYFLTF MLDIGGFSFI 300 

MCYEDPVSSL TGNSLKSVAA SKADVQQNTE DNGKPEKPEP SSPSPRAPAS SQHPSVPASP 360

QGRNAKDLLL DLKNKILANG GAPRKPQLRA KKAEELDLQS TEITGEEELG SREDSPMSPS 420

DTQDQKRTLR PPSRHGHSVV APGRTAVRAR MPALPRREGV DKPGFSLATQ PRPGAPPSAS 480

ASPAHHASTQ GTSHRPSLPA SLHDNDLVDS DEDERAVGSL HPKGAFAQPR PALSPSRQSP 540

SSVLRDRSSV HPGAKPASPA RRTPHSGAAE EDSSASAPPS RLSPPHGGSS RLLPTQPHLS 600 

SPLSKGGKDG EDAPATNSNA PSRSTMSSSV SSHLSSRTQV SEGAEASDGE SHGDGDREDG 660 

GRQAEATAQT LRARPASGHF HLLRHKPFAA NGRSPSRFSI GRGPRLQPSS SPQSTVPSRA 720 

HPRVPSHSDS HPKLSSGIHG DEEDEKPLPA TVVNDHVPSS SRQPISRGWE DLRRSPQRGA 780

SLHRKEPIPE NPKSTGADTH PQGKYSSLAS KAQDVQQSTD ADTEGHSPKA QPGSTDRHAS 840

PARPPAARSQ QHPSVPRRMT PGRAPEQQPP PPVATSQHHP GPQSRDAGRS PSQPRLSLTQ 900

AGRPRPTSQG RSHSSSDPYT ASSRGMLPTA LQNQDEDAQG SYDDDSTEVE AQDVRAPAKA 960
QRGHAASPAR PSRPGGPQSR ARVPSRAAPG 1020



h SSKSQQSVSA EDEEEEDAGF FKGGKEDLLS SSVPKKPSSS TPRGGKDADG 1080 

\ IALAPRGGSL APVKRPLPPP PGSSPRASHV PSRPPPRSAA TVSPVAGTHP 1140 

WPRYTTRAPP GHFSTTPMLS LRQRMMHARF RNPLSRQPAR FSYRQGYNGR PNVEGKVLPG 1200 

SNGKPNGQRI INGPQGTKWV VDLDRGLVLN AEGRYLQDSH GNPLRIKLGG DGRTIVDLEG 1260

TPVVSPDGLP LFGQGRHGTP LANAQDKPIL SLGGKPLVGL EVIKKTTHPP TTTMQPTTTT 1320

TPLPTTTTPR PTTATTMQPT TTTTPLPTTT PRPTTATTRR TTTRRPTTTV RTTTRTTTTT 1380

TPKPTTPIPT CPPGTLERHD DDGNLIMSSN GIPECYAEED EFSGLETDTA VPTEEAYVIY 1440 

DEDYEFETSR PPTTTEPSTT ATTPRVIPEE GAISSFPEEE FDLAGRKRFV APYVTYLNKD 1500

PSAPCSLTDA LDHFQVDSLD EIIPNDLKKS DLPPQHAPRN ITVVAVEGCH SFVIVDWDKA 1560

TPGDLVTGYL VYSASYEDFI RNKFSTQASS VTHLPIENLK PNTRYYFKVQ AQNPHGYGPI 1620
SPSVSFVTES DNPLLVVRPP G

Seq ID NO: 420 DNA sequence



GTGGATTTTA GAGATACCTC CCCTCCTTCT GCTCAGCTGC CTTGCAGTAA TTAAACTCTT 60 

TCTCTGCTGC AACACCCCTA CTGTTCTCCG TGTATTGGCT TTTCTGGGCA GCAGGAAGGA 120

AAAGCTGATG CGATGCTCTC AGTGCCGCGT CGCCAAATA- TGTAi P, "TGTCAGAA 180

AAAAGCTTGG CCAGACCACA AGCGGGAATG CAAATGCCTT AAAAGCTGCA AACCCAGATA 240 

TCCTCCAGAC TCCGTTCGAC TTCTTGGCAG AGTTGTCTTC AAACTTATGG ATGGAGCACC 300 

TTCAGAATCA GAGAAGCTTT ACTCATTTTA TGATCTGGAG TCAAATATTA ACAAACTGAC 360 

TGAAGATAAG AAAGAGGGCC TCAGGCAACT CGTAATGACA TTTCAACATT TCATGAGAGA 420

AGAAATACAG GATGCCTCTC AGCTGCCACC TGCCTTTGAC CTTTTTGAAG CCTTTGCAAA 480 

AGTGATCTGC AACTCTTTCA CCATCTGTAA TGCGGAGATG CAGGAAGTTG GTGTTGGCCT 540 

ATATCCCAGT ATCTCTTTGC TCAATCACAG CTGTGACCCC AACTGTTCGA TTGTGTTCAA 600 

TGGGCCCCAC CTCTTACTGC GAGCAGTCCG AGACATCGAG GTGGGAGAGG AGCTCACCAT 660 

CTGCTACCTG GATATGCTGA TGACCAGTGA GGAGCGCCGG AAGCAGCTGA GGGACCAGTA 720 

CTGCTTTGAA TGTGACTGTT TCCGTTGCCA AACCCAGGAC AAGGATGCTG ATATGCTAAC 780 

TGGTGATGAG CAAGTATGGA AGGAAGTTCA AGAATCCCTG AAAAAAATTG AAGAACTGAA 840 

GGCACACTGG AAGTGGGAGC AGGTTCTGGC CATGTGCCAG GCGATCATAA GCAGCAATTC 900 

TGAACGGCTT CCCGATATCA ACATCTACCA GCTGAAGGTG CTCGACTGCG CCATGGATGC 960 

CTGCATCAAC CTCGGCCTGT TGGAGGAAGC CTTGTTCTAT GGTACTCGGA CCATGGAGCC 1020 

ATACAGGATT TTTTTCCCAG GAAGCCATCC CGTCAGAGGG GTTCAAGTGA TGAAAGTTGG 1080 

CAAACTGCAG CTACATCAAG GCATGTTTCC CCAAGCAATG AAGAATCTGA GACTGGCTTT 1140 

TGATATTATG AGAGTGACAC ATGGCAGAGA ACACAGCCTG ATTGAAGATT TGATTCTACT 1200

TTTAGAAGAA TGCGACGCCA ACATCAGAGC ATCCTAAGGG AACGCAGTCA GAGGGAAATA 1260

CGGCGTGTGT CTTTGTTGAA TGCCTTATTG AGGTCACACA CTCTATGCTT TGTTAGCTGT 1320

GTGAACCTCT CTTATTGGAA ATTCTGTTCC GTGTTTGTGT AGGTAAATAA AGGCAGACAT 1380

T AGAGAAGCAC GATTATAATA AATTCAAAAC 1440 




ATTTCCTTGA GGATGCCAAA 
Seq ID NO: 421 Protein sequence



MRCSQCRVAK YCSAKCQKKA WPDHKRECKC LKSCKPRYPP DSVRLLGRVV

SEKLYSFYDL ESNINKLTED KKEGLRQLVM TFQHFMREEI QDASQLPPAF DLFEAFAKVI 120

CNSFTICNAE MQEVGVGLYP SISLLNHSCD PNCSIVFNGP HLLLRAVRDI EVGEELTICY 180 

LDMLMTSEER RKQLRDQYCF ECDCFRCQTQ DKDADMLTGD EQVWKEVQES LKKIEELKAH 240

WKWEQVLAMC QAIISSNSER LPDINIYQLK VLDCAMDACI NLGLLEEALF YGTRTMEPYR 300 

IFFPGSHPVR GVQVMKVGKL QLHQGMFPQA MKNLRLAFDI MRVTHGREHS LIEDLILLLE 360 
ECDANIRAS 

Seq ID NO: 422 DNA sequence 

Nucleic Acid Accession #s NM_003014.2 

Coding sequence: 238.. 648 

1 11 21 31 41 51 

I I I I I I 

GGCGGGTTCG CGCCCCGAAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 

CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120

AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180

GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240

TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 

GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 

ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420

GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480 

GCGCCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540

CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 

AGCCTGGCCT GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 

ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720

CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 

AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 

AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900

GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960

TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020

CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080

AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 

AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 

GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260

AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320

GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGCCCTAACA 1380
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGCTTCA 1440

GTTTTTCTTT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 

GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560

CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620

AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 

TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAA^TATAA TGTTTTTAAG 1740 

AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 

TTTTTGTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 



3 AAAAATAAAT AAAAAGGAGA GGCAGACAAT GTCTGGATTC CTGTTTTTTG 
GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 
r TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA A 
G CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT C 

A CAAGCT T AAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220

FTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280

AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340

AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400

ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460

AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520 

ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580

GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640

CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700

TCTCATTTCT AACAGCTGTG TCATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGGAATTC 



11 21 31 41 51 

I I I I I 

3 LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 
S AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP
/ YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC
S AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV 

Seq ID NO: 424 DNA sequence 
Nucleic Acid Accession # : BC010423 
Coding sequence: 248.. 1780 

1 11 21 31 41 51 

I I I I I I 

C TGGGGGAGCT CGGAGCTCCC GATCACGGCT TCTTGGGGGT 
T GGGTGTGTAG AACGGGGCCG GGGCTGGGGC TGGGTCCCCT AGTGGAGACC 
CAAGTGCGAG AGGCAAGAAC TCTGCAGCTT CCTGCCTTCT GGGTCAGTTC CTTATTCAAG 
TCTGCAGCCG GCTCCCAGGG AGATCTCGGT GGAACTTCAG AAACGCTGGG CAGTCTGCCT 
TTCAACCATG CCCCTCTCCC TGGGAGCCGA GATGTGGGGG CCTGAGGCCT GGCTGCTGCT 
GCTGCTACTG CTGGCATCAT TTACAGGCCG GTGCCCCGCG GGTGAGCTGG AGACCTCAGA 
CGTGGTAACT GTGGTGCTGG GCCAGGACGC AAAACTGCCC TGCTTCTACC GAGGGGACTC 
CCGCGACCAA GTGCGGCAAG TGGCATGGGC TCGGGTGGAC GCGGGCGAAG G 
ACTAGCGCTA CTGCACTCCA AATACGGGCT TCATGTGAGC CCGGCTTA 
GGAGCAGCCG CCGCCCCCAC GCAACCCCCT GG~ " " 

GCAGGCGGAT GAGGGCGAGT ACGAGTGCCG GGTCAGCACC TTCCCCGCCG GCAGCTTCCA 660 

GGCGCGGCTG CGGCTCCGAG TGCTGGTGCC TCCCCTGCCC TCACTGAATC CTGGTCCAGC 720 

ACTAGAAGAG GGCCAGGGCC TGACCCTGGC AGCCTCCTGC ACAGCTGAGG GCAGCCCAGC 780 

CCCCAGCGTG ACCTGGGACA CGGAGGTCAA AGGCACAACG TCCAGCCGTT CCTTCAAGCA 840 

CTCCCGCTCT GCTGCCGTCA CCTCAGAGTT CCACTTGGTG CCTAGCCGCA GCATGAATGG 900 

GCAGCCACTG ACTTGTGTGG TGTCCCATCC TGGCCTGCTC CAGGACCAAA GGATCACCCA 960

CATCCTCCAC GTGTCCTTCC TTGCTGAGGC CTCTGTGAGG GGCCTTGAAG ACCAAAATCT 1020 

GTGGCACATT GGCAGAGAAG GAGCTATGCT CAAGTGCCTG AGTGAAGGGC AGCCCCCTCC 1080 

CTCATACAAC TGGACACGGC TGGATGGGCC TCTGCCCAGT GGGGTACGAG TGGATGGGGA 1140

CACTTTGGGC TTTCCCCCAC TGACCACTGA GCACAGCGGC ATCTACGTCT GCCATGTCAG 1200 

CAATGAGTTC TCCTCAAGGG ATTCTCAGGT CACTGTGGAT GTTCTTGACC CCCAGGAAGA 1260 

CTCTGGGAAG CAGGTGGACC TAGTGTCAGC CTCGGTGGTG GTGGTGGGTG TGATCGCCGC 1320 

ACTCTTGTTC TGCCTTCTGG TGGTGGTGGT GGTGCTCATG TCCCGATACC ATCGGCGCAA 1380 

GGCCCAGCAG ATGACCCAGA AATATGAGGA GGAGCTGACC CTGACCAGGG AGAACTCCAT 1440 

CCGGAGGCTG CATTCCCATC ACACGGACCC CAGGAGCCAG CCGGAGGAGA GTGTAGGGCT 1500 

GAGAGCCGAG GGCCACCCTG ATAGTCTCAA GGACAACAGT AGCTGCTCTG TGATGAGTGA 1560 

AGAGCCCGAG GGCCGCAGTT ACTCCACGCT GACCACGGTG AGGGAGATAG AAACACAGAC 1620

TGAACTGCTG TCTCCAGGCT CTGGGCGGGC CGAGGAGGAG GAAGATCAGG ATGAAGGCAT 1680 

CAAACAGGCC ATGAACCATT TTGTTCAGGA GAATGGGACC CTACGGGCCA AGCCCACGGG 1740 

CAATGGCATC TACATCAATG GGCGGGGACA CCTGGTCTGA CCCAGGCCTG CCTCCCTTCC 1800

CTAGGCCTGG CTCCTTCTGT TGACATGGGA GATTTTAGCT CATCTTGGGG GCCTCCTTAA 1860 

ACACCCCCAT TTCTTGCGGA AGATGCTCCC CATCCCACTG ACTGCTTGAC CTTTACCTCC 1920 

AACCCTTCTG TTCATCGGGA GGGCTCCACC AATTGAGTCT CTCCCACCAT GCATGCAGGT 1980 

I GTGCATGTGT GCCTGTGTGA GTGTTGACTG ACTGTGTGTG TGTGGAGGGG 2040 

3 TGGAGGGGTG ACTGTGTCCG TGGTGTGTAT TATGCTGTCA TATCAGAGTC 2100 

AAGTGAACTG TGGTGTATGT GCCACGGGAT TTGAGTGGTT GCGTGGGCAA CACTGTCAGG 2160 

T GGCTGTGTGT GACCTCTGCC TGAAAAAGCA GGTATTTTCT 2220 

A ATGATGCAGA GGTTGGAGGA GAGAGGTGGA GACTGTGGCT 2280 

A TAGCTGGAGC TGGAATCTGC CTCCGGTGTG AGGGAACCTG 2340 

TCTCCTACCA CTTCGGAGCC ATGGGGGCAA GTGTGAAGCA GCCAGTCCCT GGGTCAGCCA 2400 

GAGGCTTGAA CTGTTACAGA AGCCCTCTGC CCTCTGGTGG CCTCTGGGCC TGCTGCATGT 2460 

ACATATTTTC TGTAAATATA CATGCGCCGG GAGCTTCTTG CAGGAATACT GCTCCGAATC 2520 

ACTTTTAATT TTTTTCTTTT TTTTTTCTTG CCCTTTCCAT TAGTTGTATT TTTTATTTAT 2580 

T AGAGTTTGAG TCCAGCCTGG ACGATATAGC CAGACCCTGT 2640
AAAAAAAAAA
MPLSLGAEMW GPEAWLLLLL LLASFTGRCP AGELETSDVV TVVLGQDAKL PCFYRGDSGE
QVGQVAWARV DAGEGAQELA LLHSKYGLHV SPAYEGRVEQ PPPPRNPLDG SVLLRNAVQA
DEGEYECRVS TFPAGSFQAR LRLRVLVPPL PSLNPGPALE EGQGLTLAAS CTAEGSPAPS
VTWDTEVKGT TSSRSFKHSR SAAVTSEFHL VPSRSMNGQP LTCVVSHPGL LQDQRITHIL
HVSFLAEASV RGLEDQNLWH IGREGAMLKC LSEGQPPPSY NWTRLEGPLP SGVRVDGDTL
GFPPLTTEHS GIYVCHVSNE FSSRDSQVTV DVLDPQEDSG KQVDLVSASV VVVGVIAALL
FCLLVVVVVL MSRYHRRKAQ QMTQKYEEEL TLTRENSIRR LHSHHTDPRS QPEESVGLRA
1 11 21 31 41

I I I I I 

CACTAACGCT CTTCCTAGTC CCCGGGCCAA CTCGGACAGT TTGCTCATTT A 

TCAAGGCTGG CTTGTGCCAG AACGGCGCGC GCGCGACGCA CGCACACACA CGGGGGGAAA 120

CTTTTTTAAA AATGAAAGGC TAGAAGAGCT CAGCGGCGGC GCGGGCCGTG CGCGAGGGCT 180

CCGGAGCTGA CTCGCCGAGG CAGGAAATCC CTCCGGTCGC GACGCCCGGC CCCGCTCGGC 240 

GCCCGCGTGG GATGGTGCAG CGCTCGCCGC CGGGCCCGAG AGCTGCTGCA CTGAAGGCCG 300 

GCGACGATGG CAGCGCGCCC GCTGCCCGTG TCCCCCGCCC GCGCCCTCCT GCTCGCCCTG 360

GCCGGTGCTC TGCTCGCGCC CTGCGAGGCC CGAGGGGTGA GCTTATGGAA CGAAGGAAGA 420

GCTGATGAAG TTGTCAGTGC CTCTGTTCGG AGTGGGGACC TCTGGATCCC AGTGAAGAGC 480

TTCGACTCCA AGAATCATCC AGAAGTGCTG AATATTCGAC TACAACGGGA AAGCAAAGAA 540 

CTGATCATAA ATCTGGAAAG AAATGAAGGT CTCATTGCCA GCAGTTTCAC GGAAACCCAC 600 

TATCTGCAAG ACGGTACTGA TGTCTCCCTC GCTCGAAATT ACACGGTAAT TCTGGGTCAC 660

TGTTACTACC ATGGACATGT ACGGGGATAT TCTGATTCAG CAGTCAGTCT CAGCACGTGT 720 

TCTGGTCTCA GGGGACTTAT TGTGTTTGAA AATGAAAGCT ATGTCTTAGA ACCAATGAAA 780 

AGTGCAACCA ACAGATACAA ACTCTTCCCA GCGAAGAAGC TGAAAAGCGT CCGGGGATCA 840 

TGTGGATCAC ATCACAACAC ACCAAACCTC GCTGCAAAGA ATGTGTTTCC ACCACCCTCT 900 

CAGACATGGG CAAGAAGGCA TAAAAGAGAG ACCCTCAAGG CAACTAAGTA TGTGGAGCTG 960

GTGATCGTGG CAGACAACCG AGAGTTTCAG AGGCAAGGAA AAGATCTGGA AAAAGTTAAG 1020 

CAGCGATTAA TAGAGATTGC TAATCACGTT GACAAGTTTT ACAGACCACT GAACATTCGG 1080 

ATCGTGTTGG TAGGCGTGCA AGTGTGGAAT GACATGGACA AATGCTCTGT AAGTCAGGAC 1140 
CCATTCACCA GCCTCCATGA ATTTCTGGAC TGGAGGAAGA TGAAGCTTCT A 



GCCCCAATCA TGAGCATGTG CACGGCAGAC C 

GACAATCCCC TTGGTGCAGC CGTGACCCTG GCACATGAGC TGGGCCACAA TTTCGGGATG 1380

AATCATGACA CACTGGACAG GGGCTGTAGC TGTCAAATGG CGGTTGAGAA AGGAGGCTGC 1440

ATCATGAACG CTTCCACCGG GTACCCATTT CCCATGGTGT TCAGCAGTTG CAGCAGGAAG 1500

GACTTGGAGA CCAGCCTGGA GAAAGGAATG GGGGTGTGCC TGTTTAACCT GCCGGAAGTC 1560

AGGGAGTCTT TCGGGGGCCA GAAGTGTGGG AACAGATTTG TGGAAGAAGG AGAGGAGTGT 1620 

GACTGTGGGG AGCCAGAGGA ATGTATGAAT CGCTGCTGCA ATGCCACCAC CTGTACCCTG 1680

AAGCCGGACG CTGTGTGCGC ACATGGGCTG TGCTGTGAAG ACTGCCAGCT GAAGCCTGCA 1740 

GGAACAGCGT GCAGGGACTC CAGCAACTCC TGTGACCTCC CAGAGTTCTG CACAGGGGCC 1800 

AGCCCTCACT GCCCAGCCAA CGTGTACCTG CACGATGGGC ACTCATGTCA GGATGTGGAC 1860 

GGCTACTGCT ACAATGGCAT CTGCCAGACT CACGAGCAGC AGTGTGTCAC ACTCTGGGGA 1920 

CCAGGTGCTA AACCTGCCCC TGGGATCTGC TTTGAGAGAG TCAATTCTGC AGGTGATCCT 1980 

TATGGCAACT GTGGCAAAGT CTCGAAGAGT TCCTTTGCCA AATGCGAGAT GAGAGATGCT 2040

AAATGTGGAA AAATCCAGTG TCAAGGAGGT GCCAGCCGGC CAGTCATTGG TACCAATGCC 2100 

GTTTCCATAG AAACAAACAT CCCCCTGCAG CAAGGAGGCC GGATTCTGTG CCGGGGGACC 2160 

CACGTGTACT TGGGCGATGA CATGCCGGAC CCAGGGCTTG TGCTTGCAGG CACAAAGTGT 2220 

GCAGATGGAA AAATCTGCCT GAATCGTCAA TGTCAAAATA TTAGTGTCTT TGGGGTTCAC 2280

GAGTGTGCAA TGCAGTGCCA CGGCAGAGGG GTGTGCAACA ACAGGAAGAA CTGCCACTGC 2340 

GAGGCCCACT GGGCACCTCC CTTCTGTGAC AAGTTTGGCT TTGGAGGAAG CACAGACAGC 2400

GGCCCCATCC GGCAAGCAGA TAACCAAGGT TTAACCATAG GAATTCTGGT GACCATCCTG 2460

TGTCTTCTTG CTGCCGGATT TGTGGTTTAT CTCAAAAGGA AGACCTTGAT ACGACTGCTG 2520 

TTTACAAATA AGAAGACCAC CATTGAAAAA CTAAGGTGTG TGCGCCCTTC CCGGCCACCC 2580 

CGTGGCTTCC AACCCTGTCA GGCTCACCTC GGCCACCTTG GAAAAGGCCT GATGAGGAAG 2640

CCGCCAGATT CCTACCCACC GAAGGACAAT CCCAGGAGAT TGCTGCAGTG TCAGAATGTT 2700 

GACATCAGCA GACCCCTCAA CGGCCTGAAT GTCCCTCAGC CCCAGTCAAC TCAGCGAGTG 2760

CTTCCTCCCC TCCACCGGGC CCCACGTGCA CCTAGCGTCC CTGCCAGACC CCTGCCAGCC 2820

AAGCCTGCAC TTAGGCAGGC CCAGGGGACC TGTAAGCCAA ACCCCCCTCA GAAGCCTCTG 2880

CCTGCAGATC CTCTGGCCAG AACAACTCGG CTCACTCATG CCTTGGCCAG GACCCCAGGA 2940

CAATGGGAGA CTGGGCTCCG CCTGGCACCC CTCAGACCTG CTCCACAATA TCCACACCAA 3000

GTGCCCAGAT CCACCCACAC CGCCTATATT AAGTGAGAAG CCGACACCTT TTTTCAACAG 3060

TGAAGACAGA AGTTTGCACT ATCTTTCAGC TCCAGTTGGA GTTTTTTGTA CCAACTTTTA 3120 

GGATTTTTTT TAATGTTTAA AACATCATTA CTATAAGAAC TTTGAGCTAC TGCCGTCAGT 3180 

GCTGTGCTGT GCTATGGTGC TCTGTCTACT TGCACAGGTA CTTGTAAATT ATTAATTTAT 3240

GCAGAATGTT GATTACAGTG CAGTGCGCTG TAGTAGGCAT TTTTACCATC ACTGAGTTTT 3300 

CCATGGCAGG AAGGCTTGTT GTGCTTTTAG TATTTTAGTG AACTTGAAAT ATCCTGCTTG 3360

ATGGGATTCT GGACAGGATG TGTTTGCTTT CTGATCAAGG CCTTATTGGA AAGCAGTCCC 3420 

2 CCAGCTGTGC TTATGGTACC AGATGCAGCT CAAGAGATCC CAAGTAGAAT 3480 

ZTGGATT CCCCATCTCA GGCCAGAGCC AAGGGGCTTC AGGTCCAGGC 3540 

TGTGTTTGGC TTTCAGGGAG GCCCTGTGCC CCTTGACAAC TGGCAGGCAG GCTCCCAGGG 3600

ACACCTGGGA GAAATCTGGC TTCTGGCCAG GAAGCTTTGG TGAGAACCTG GGTTGCAGAC 3660 

AGGAATCTTA AGGTGTAGCC ACACCAGGAT AGAGACTGGA ACACTAGACA AGCCAGAACT 3720

TGACCCTGAG CTGACCAGCC GTGAGCATGT TTGGAAGGGG TCTGTAGTGT CACTCAAGGC 3780 

GGTGCTTGAT AGAAATGCCA AGCACTTCTT TTTCTCGCTG TCCTTTCTAG AGCACTGCCA 3840
CCAGTAGGTT ATTTAGCTTG GGAAAGGTGG TGTTTCTGTA AGAAACCTAC TGCCCAGGCA 3900

CTGCAAACCG CCACCTCCCT ATACTGCTTG GAGCTGAGCA AATCACCACA AACTGTAATA 3960 

CAATGATCCT GTATTCAGAC AGATGAGGAC TTTCCATGGG ACCACAACTA TTTTCAGATG 4020 

TGAACCATTA ACCAGATCTA GTCAATCAAG TCTGTTTACT GCAAGGTTCA ACTTATTAAC 4080 

AATTAGGCAG ACTCTTTATG CTTGCAAAAA CTACAACCAA TGGAATGTGA TGTTCATGGG 4140 

TATAGTTCAT GTCTGCTATC ATTATTCGTA GATATTGGAC AAAGAACCTT CTCTATGGGG 4200 

CATCCTCTTT TTCCAACTTG GCTGCAGGAA TCTTTAAAAG ATGCTTTTAA CAGAGTCTGA 4260 

ACCTATTTCT TAAACACTTG CAACCTACCT GTTGAGCATC ACAGAATGTG ATAAGGAAAT 4320 

CAACTTGCTT ATCAACTTCC TAAATATTAT GAGATGTGGC TTGGGCAGCA TCCCCTTGAA 4380

CTCTTCACTC TTCAAATGCC TGACTAGGGA GCCATGTTTC ACAAGGTCTT TAAAGTGACT 4440 

AATGGCATGA GAAATACAAA AATACTCAGA TAAGGTAAAA TGCCATGATG CCTCTGTCTT 4500 

CTGGACTGGT TTTCACATTA GAAGACAATT GACAACAGTT ACATAATTCA CTCTGAGTGT 4560 

TTTATGAGAA AGCCTTCTTT TGGGGTCAAC AGTTTTCCTA TGCTTTGAAA CAGAAAAATA 4620

TGTACCAAGA ATCTTGGTTT GCCTTCCAGA AAACAAAACT GCATTTCACT TTCCCGGTGT 4680 

TCCCCACTGT ATCTAGGCAA CATAGTATTC ATGACTATGG ATAAACTAAA CACGTGACAC 4740 

AAACACACAC AAAAGGGAAC CCAGCTCTAA TACATTCCAA CTCGTATAGC ATGCATCTGT 4800

TTATTCTATA GTTATTAAGT TCTTTAAAAT GTAAAGCCAT GCTGGAAAAT AATACTGCTG 4860 

AGATACATAC AGAATTACTG TAACTGATTA CACTTGGTAA TTGTACTAAA GCCAAACATA 4920

TATATACTAT TAAAAAGGTT TACAGAATTT TATGGTGCAT TACGTGGGCA TTGTCTTTTT 4980 

AGATGCCCAA ATCCTTAGAT CTGGCATGTT AGCCCTTCCT CCAATTATAA GAGGATATGA 5040 
ACCAAAAAAA AAAAAAAAAA AA 



1 11 21 31 41 51 

I I I I I 

MAARPLPVSP ARALLLALAG ALLAPCEARG VSLWNEGRAD EVVSASVRSG DLWIPVKSFD
SKNHPEVLNI RLQRESKELI INLERNEGLI ASSFTETHYL QDGTDVSLAR KYTVILGHCY 
YHGHVRGYSD SAVSLSTCSG LRGLIVFENE SYVLEPMKSA TNRYKLFPAK KLKSVRGSCG
SHHNTPNLAA KNVFPPPSQT WARRHKRETL KATKYVELVI VADNREFQRQ GKDLEKVKQR 
LIEIANKVDK FYRPLNIRIV LVGVEVWNDM DKCSVSQDPF TSLHEFLDWR KMKLLPRKSH 
DNAQLVSGVY FQGTTIGMAP IMSMCTADQS GGIVMDHSDN PLGAAVTLAH ELGHNFGMNH 
DTLDRGCSCQ MAVEKGGCIM NASTGYPFPM VFSSCSRKDL ETSLEKGMGV CLFNLPEVRE 
SFGGQKCGNR FVEEGEECDC GEPEECMNRC CNATTCTLKP DAVCAHGLCC EDCQLKPAGT 
ACRDSSNSCD LPEFCTGASP HCPANVYLHD GHSCQDVDGY CYNGICQTHE QQCVTLWGPG 
AKPAPGICFE RVNSAGDPYG NCGKVSKSSF AKCEMRDAKC GKIQCQGGAS RPVIGTNAVS 
IETNIPLQQG GRILCRGTHV YLGDDMPDPG LVLAGTKCAD GKICLNRQCQ NISVFGVHEC 
ANQCHCRGVC NNRKNCMCEA HWAPPFCDKF GFGGSTDSGP IRQADNQGLT IGILVTILCL 
LAAGFVVYLK RKTLIRLLFT NKKTTIEKLR CVRPSRPPRG FQPCQAHLGH LGKGLMRKPP
DSYPPKDHPR RLLQCQNVDI SRPLNGLNVP QPQSTQRVLP PLHRAPRAPS VPARPLPAKP 
ALRQAQCTCK PNPPQKPLPA DPLARTTRLT HALAHTPGQW ETGLRLAPLR PAPQYPHQVP 
RSTHTAYIK 

Seq ID NO: 428 DNA sequence 



I I I I I I 

GAGGAGGAGG GAAAAGGCGA GCAAAAAGGA AGAGTGGGAG GAGGAGGGGA AGCGGCGAAG 60 

GAGGAAGAGG AGGAGGAGGA AGAGGGGAGC ACAAAGGATC CAGGTCTCCC GACGGGAGGT 120

TAATACCAAG AACCATGTGT GCCGAGCGGC TGGGCCAGTT CATGACCCTG GCTTTGGTGT 180

TGGCCACCTC TGACCCGGCG CGGGGGACCG ACGCCACCAA CCCACCCGAG GGTCCCCAAG 240

ACAGGAGCTC CCAGCAGAAA GGCCGCCTGT CCCTGCAGAA TACAGCGGAG ATCCAGCACT 300 

GTTTGGTCAA CGCTGGCGAT GTGGGGTGTG GCGTGTTTGA ATGTTTCGAG AACAACTCTT 360

GTGAGATTCG GGGCTTACAT GGGATTTGCA TGACTTTTCT GCACAACGCT GGAAAATTTG 420

ATGCCCAGGG CAAGTCATTC ATCAAAGACG CCTTGAAATG TAAGGCCCAC GCTCTGCGGC 480

ACAGGTTCGG CTGCATAAGC CGGAAGTGCC CGGCCATCAG GGAAATGGTG TCCCAGTTGC 540 

AGCGGGAATG CTACCTCAAG CACGACCTGT GCGCGGCTGC CCAGGAGAAC ACCCGGGTGA 600 

TAGTGGAGAT GATCCATTTC AAGGACTTGC TGCTGCACGA ACCCTACGTG GACCTCGTGA 660 

ACTTGCTGCT GACCTGTGGG GAGGAGGTGA AGGAGGCCAT CACCCACAGC GTGCAGGTTC 720 

AGTGTGAGCA GAACTGGGGA AGCCTGTGCT CCATCTTGAG CTTCTGCACC TCGGCCATCC 780 

AGAAGCCTCC CACGGCGCCC CCCGAGCGCC AGCCCCAGGT GGACAGAACC AAGCTCTCCA 840

GGGCCCACCA CGGGGAAGCA GGACATCACC TCCCAGAGCC CAGCAGTAGG GAGACTGGCC 900 

GAGGTGCCAA GGGTGAGCGA GGTAGCAAGA GCCACCCAAA CGCCCATGCC CGAGGCAGAG 960

TCGGGGGCCT TGGGGCTCAG GGACCTTCCG GAAGCAGCGA GTGGGAAGAC GAACAGTCTG 1020

AGTATTCTGA TATCCGGAGG TGAAATGAAA GGCCTGGCCA CGAAATCTTT CCTCCACGCC 1080 

GTCCATTTTC TTATCTATGG ACATTCCAAA ACATTTACCA TTAGAGAGGG GGGATGTCAC 1140

ACGCAGGATT CTGTGGGGAC TGTGGACTTC ATCGAGGTGT GTGTTCGCGG AACGGACAGG 1200

TGAGATGGAG ACCCCTGGGG CCGTGGGGTC TCAGGGGTGC CTGGTGAATT CTGCACTTAC 1260

ACGTACTCAA GGGAGCGCGC CCGCGTTATC CTCGTACCTT TGTCTTCTTT CCATCTGTGG 1320 

AGTCAGTGGG TGTCGGCCGC TCTGTTGTGG GGGAGGTGAA CCAGGGAGGG GCAGGGCAAG 1380 

GCAGGGCCCC CAGAGCTGGG CCACACAGTG GGTGCTGGGC CTCGCCCCGA AGCTTCTGGT 1440 

GCAGCAGCCT CTGGTGCTGT CTCCGCGGAA GTCAGGGCGG CTGGATTCCA GGACAGGAGT 1500 

GAATGTAAAA ATAAATATCG CTTAGAATGC AGGAGAAGGG TGGAGAGGAG GCAGGGGCCG 1560

AGGGGGTGCT TGGTGCCAAA CTGAAATTCA GTTTCTTGTG TGGGGCCTTG CGGTTCAGAG 1620 

CTCTTGGCGA GGGTGGAGGG AGGAGTGTCA TTTCTATGTG TAATTTCTGA GCCATTGTAC 1680 

TGTCTGGGCT GGGGGGGACA CTGTCCAAGG GAGTGGCCCG TATGAGTTTA TATTTTAACC 1740 

ACTGCTTCAA ATCTCGATTT CACTTTTTTT ATTTATCCAG TTATATCTAC ATATCTGTCA 1800 

TCTAAATAAA TGGCTTTCAA ACAAAGCAAC TGGGTCATTA AAACCAGCTC AAAGGGGGTT 1860

TAAAAAAAAA AAAACCAGCC CATCCTTTGA GGCTGATTTT TCTTTTTTTT AAGTTCTATT 1920 

TTAAAAGCTA TCAAACAGCG ACATAGCCAT ACATCTGACT GCCTGACATG GACTCCTGCC 1980 

CACTTGGGGG AAACCTTATA CCCAGAGGAA AATACACACC TGGGGAGTAC ATTTGACAAA 2040 

TTTCCCTTAG GATTTCGTTA TCTCACCTTG ACCCTCAGCC AAGATTGGTA AAGCTGCGTC 2100 

CTGGCGATTC CAGGAGACCC AGCTGGAAAC CTGGCTTCTC CATGTGAGGG GATGGGAAAG 2160 

GAAAGAAGAG AATGAAGACT ACTTAGTAAT TCCCATCAGG AAATGCTGAC CTTTTACATA 2220
AAATCAAGGA GACTGCTGAA AATCTCTAAG GGACAGGATT TTCCAGATCC TAATTGGAAA 2280
TTTAGCAATA AGGAGAGGAG TCCAAGGGGA CAAATAAAGG CAGAGAGAGA GAGAGAGAGA 2340
GGGAGAGGAA GAAAAGAGAG AGAGAAAAGA GCCTCGTGCC



MCAERLGQFM TLALVLATFD PARGTDATNP PEGPQDRSSQ QKGRLSLQNT AEIQHCLVNA
GDVGCGVFEC FENNSCEIRG LHGICMTFLH NAGKFDAQGK SFIKDAL
ISRKCPAIRE MVSQLQRECY LKHDLCAAAQ ENTRVIVEMI HFKDLLLHEP Y

HSVQVQCEQN WGSLCSILSF CTSAIQKPPT APPERQPQVD RTKLSRAHHG 
SRETGRGAKG ERGSKSHPNA HARGRVGGLG AQGPSGSSEW EDEQSEYSDI 



3 CCCCCGATGC 



AGCCCTGCCC 
CAGCCTCAGG 
CCGACAGAAG 
GATCCTTCGG 
CCTAAAGGTA 
TGACATCATG 



TGCTGCTGCT 
ACCTCCATGC 
CACCTGCCCC 



CGACTATGAT 



GTCCGCCTTC 
TCAACACCTA 
CCAGGCTGGG 
CTGTGAGGCC 
GGGCTTTGTG 



CATTTGGTTC 
CCCCGCACCC 
GGGTCCCGAG 
CAGCACCCGG 
CTCTGAGATC 



TTCCCATGGC 
TGGAGCGATG 
ATCGACTTCG 
CTGGCCCATG 
GAGACCTGGA 
TTTGGCCACG 
TACACCTTTC 
TATGGCCAGC 
ATAGACACCA 
TCCTTTGACG 
TGGCGCCTCC 
CAGGGACTGC 



TTTCTGGCGG 
AGTTGGTGCA 
TGACGCCACT 



GGGTCCTGAC 
ATGCCCTCAG 
ATCTTTGTGG 
GGTGGGGTAC 
AGCGACTGTC 
GGGACCCGCT 
GTAGCACCAT 
TCCTTCCAGG 
TGAGCAACTG 
ATCTGTCTGC 
GTTCACAGTC 
CAACATACCT 
ATCCTCCAAA 



AAGAACAAGA 
CGTGTAGACA 
GAGGCTGCCT 
AAGTTTGACC 
TTCTTTGGCT 
GGGTGCTGAC 
CTGTGGGCAC 
AACCACCATG 
TCAGACTGGG 



GCTACCCACT 
CCTGGCCCAC 
ATGAGATTGC 
CGGTCTCCAC 
GTGGGGGCCA 
CCAGCCCTGT 
CTCAGTACTG 
TGGGCCTGGT 
TC7ACT7CTT 



GCTCCAGCCG 
CGAGAGGAGG 
TGCCACGCAG 
CGACCCATCT 
GCGCTGGGAG 
GGAGCAGGTG 
CACCTTTACT 
GCATGGGGAC 
CAAGACTCAC 
TGACCAGGGC 
GCAGCACACA 
GAGTCTCAGC 
TGTCACCTCC 
ACCGCTGGAG 
CATCCGAGGC 
GCTGCAGCCC 



3 CTCCGCAGCG CGGCCGCGCG 60

CCGCCGCTGC TGGCCCGGGC 120 

GGGCCACAGC CCTGGCATGC 180 

GAAGCCCCCC GGCCTGCCAG 240 

GATGGGCTGA GTGCCCGCAA 300 

AAGACGGACC TCACCTACAG 360 

CGGCAGACGA TGGCAGAGGC 420 

GAGGTGCACG AGGGCCGTGC 480 



GACCTGCCGT T 



ACAGACCTGC 
ACAGCAGCCA 
CCAGATGACT 
AGGACCCCAG 
CCAGACGCCC 
GAGCTCTTTT 



AGGCCCTGAT 720 



GGCAGGACTG 
GGCTGGCACT 
GGCTGTAGGG 



TCCAGGATGC 
CTGTGAAGGT 
GTGCCGAGCC 
CCCTGCCAGG 
CAGGCATGGG 
ACAACTGCCG 
CAGGGAGGCT 
TGGCAAACCT 
GGGGAACTGG 
GAAGCAAGGG 



GAGGTTCCCG 
CCGAGGCAGG 
CCGCAG3GCC 



TTCGAGGATG 
GGTGAAAAGC 
GTCCATGCTG 
GACTACTGGC 
ACTGACTGGA 



CCCTGGGCCC 
CGCCAGATGC 
TCTTCAAAGC 
CATTGGCCTC 



CAGTCCTGGG 



GGCTGCCCTG 



GCAGGTCGTG 
TTAAGAGGAA 
TCTCATCCCT 



CCCGTCTCGT 
CATGGCTTGG 
ACCCATGGCC 
AGGGGGATGG 



TTAAGAGGAA GGGCAGTCTT 



AAATGGGGAG 
CAATCCTGTC 
GCCATTGTAA 
GAGGATTGTC 



ACAATCCTGG 
GGGTATTCTT 
CCAGGCCGGA 
ATGTGTGTAC 



TGCTGGGGCC 
TCCTGAGGTC 
AAATCTGTTC 
CATGCAGGAG 
TCCTCCTGAA 



GTTGTGAGGT 



GCCCTTTTCG 



AGGCCAAAAA 
CTGGAGGCTG 
CAGCACTGCT 



2040 
2220 






MAPAAWLRSA AARALLPPML LLLLQPPPLL 
PAPATQEAPR PASSLRPPRC GVPDPSDGLS 
LVQEQVRQTM AEALKVWSDV TPLTFTEVHE 
FFPKTHREGD VHFDYDETWT IGDDQGTDLL 
YPLSLSPDDC RGVQHLYGQP WPTVTSRTPA 
VSTIRGELFF FKAGFVWRLR GGQLQPGYPA 
QYWVYDGEKP VLGPAPLTEI) GLVRFPVHAA 
PVPRRATDWR GVPSEIDAAP QDADGYAYFL 
AEPANTFL 



Seq ID NO: 432 DNA sequence 
Nucleic Acid Accession #: NM_024022 
Coding sequence: 202..1563



ARALPPDVHH 
ARNRQKRFVL 
GRADIMIDFA 
QVAAHEFGHV 



RYNHGDDLPF 
LGLQHTTAAK 
EIAPLEPDAP 
SPVDAAFEDA 



DGPGGILAHA 180 



RGRLYWKFDP VKVXALEGFP 



PDACEASFDA 
QGKIWFFQGA 
FHPSTRRVDS 
RLVGPDFFGC 






C GGACGGCTCG GGTACTTTCG TTCTTAA n a TCATGC'i 3TGTGA( 
GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTGTGGC CTACTATCTC TTCCGTGGTG 
CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 
AGAGGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CTGCTGTTGA AGCCCCCTTC 
TCATTCCGAT CGCTTTTTGG CCTTGATGAT TTGAAAATAA GTCGTGTTGC ACCAGATGCA
GATGCTGTTG CTGCACAGAT CCTGTCACTG CTGCCATTGA AGTTTTTTCC AATCATCGTC 360

ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTGGTC TGGGCATCCA CTTCGACTGC 420

TCAGGGAAGT ACAGATGTCG CTCATCCTTT AAGTGTATCG AGCTGATAGC TCGATGTGAC 480 

GGAGTCTCGG ATTGCAAAGA CGGGGAGGAC GAGTACCGCT GTGTCCGGGT GGGTGGTCAG 540

AATGCCGTGC TCCAGGTGTT CACAGCTGCT TCGTGGAAGA CCATGTGCTC CGATGACTGG 600 

AAGGGTCACT ACGCAAATGT TGCCTGTGCC CAACTGGGTT TCCCAAGCTA TGTGAGTTCA 660

GATAACCTCA GAGTGAGCTC GCTGGAGGGG CAGTTCCGGG AGGAGTTTGT GTCCATCGAT 720

CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780 

TGTGCCTCTG GCCACGTGGT TACCTTGCAG TGCACAGCCT GTGGTCATAG AAGGGGCTAC 840 

AGCTCACGCA TCGTGGGTGG AAACATGTCC TTGCTCTCGC AGTGGCCCTG GCAGGCCAGC 900 

CTTCAGTTCC AGGGCTACCA CCTGTGCGGG GGCTCTGTCA TCACGCCCCT GTGGATCATC 960 

ACTGCTGCAC ACTGTGTTTA TGACTTGTAC CTCCCCAAGT CATGGACCAT CCAGGTGGGT 1020 

CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080 

AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATCGCCC TTATGAAGCT GGCCGGGCCA 1140

CTCACGTTCA ATGAAATGAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200 

GATGGAAAAG TGTGCTGGAC GTCAGGATGG GGGGCCACAG AGGATGGAGG TGACGCCTCC 1260

CCTGTCCTGA ACCACGCGGC CGTCCCTTTG ATTTCCAACA AGATCTGCAA CCACAGGGAC 1320 

GTGTACGGTG GCATCATCTC CCCCTCCATG CTCTGCGCGG GCTACCTGAC GGGTGGCGTG 1380

GACAGCTGCC AGGGGGACAG CGGGGGGCCC CTGGTGTGTC AAGAGAGGAG GCTGTGGAAG 1440 

3 CGACCAGCTT TGGCATCGGC TGCGCAGAGG TGAACAAGCC TGGGGTGTAC 1500 

T GGACTGGATC CACGAGCAGA TGGAGAGAGA CCTAAAAACC 1560

TGAAGAGGAA GGGGACAAGT AGCCACCTGA GTTCCTGAGG TGATGAAGAC AGCCCGATCC 1620

TCCCCTGGAC TCCCGTGTAG GAACCTGCAC ACGAGCAGAC ACCCTTGGAG CTCTGAGTTC 1680 

CGGCACCAGT AGCAGGCCCG AAAGAGGCAC CCTTCCATCT GATTCCAGCA CAACCTTCAA 1740 

GCTGCTTTTT GTTTTTTGTT TTTTTGAGGT GGAGTCTCGC TCTGTTGCCC AGGCTGGAGT 1800

GCAGTGGCGA AATCCCTGCT CACTGCAGCC TCCGCTTCCC TGGTTCAAGC GATTCTCTTG 1860 

CCTCAGCTTC CCCAGTAGCT GGGACCACAG GTGCCCGCCA CCACACCCAA CTAATTTTTG 1920

TATTTTTAGT AGAGACAGGG TTTCACCATG TTGGCCAGGC TGCTCTCAAA CCCCTGACCT 1980 

CAAATGATGT GCCTGCTTCA GCCTCCCACA GTGCTGGGAT TACAGGCATG GGCCACCACG 2040

CCTAGCCTCA CGCTCCTTTC TGATCTTCAC TAAGAACAAA AGAAGCAGCA ACTTGCAAGG 2100 

GCGGCCTTTC CCACTGGTCC ATCTGGTTTT CTCTCCAGGG GTCTTGCAAA ATTCCTGACG 2160 

AGATAAGCAG TTATGTGACC TCACGTGCAA AGCCACCAAC AGCCACTCAG AAAAGACGCA 2220

CCAGCCCAGA AGTGCAGAAC TGCAGTCACT GCACGTTTTC ATCTCTAGGG ACCAGAACCA 2280 

AACCCACCCT TTCTACTTCC AAGACTTATT TTCACATGTG GGGAGGTTAA TCTAGGAATG 2340 

ACTCGTTTAA GGCCTATTTT CATGATTTCT TTGTAGCATT TGGTGCTTGA CGTATTATTG 2400

TCCTTTGATT CCAAATAATA TGTTTCCTTC CCTCAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 



1 11 21 31 41 51

| | | | | |

MGENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60

LALAIGLGIH FDCSGKYRCR SSFKCIELIA RCDGVSDCKD GEDEYRCVRV GGQNAVLQVF 120

TAASWKTMCS DDWKGHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180

VTALHHSVYV REGCASGHVV TLQCTACGHR RGYSSRIVGG NMSLLSQWPW QASLQFQGYH 240

LCGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSLLDN PAPSHLVEKI VYHSKYKPKR 300

LGNDIALMKL AGPLTFNEMI QPVCLPNSEE NFPDGKVCWT SGWGATEDGG DASPVLNHAA 360

VPLISNKICN HRDVYGGIIS PSMLCAGYLT GGVDSCQGDS GGPLVCQERR LWKLVGATSF 420
GIGCAEVNKP GVYTRVTSFL DWIHEQMERD LKT

Seq ID NO: 434 DNA sequence 

Nucleic Acid Accession #: NM_000493.2

Coding sequence: 97..2139

1 11 21 31 41 51

| | | | | |

CACCTTCTGC ACTGCTCATC TGGGCAGAGG AAGCTTCAGA AAGCTGCCAA GGCACCATCT 60 

CCAGGAACTC CCAGCACGCA GAATCCATCT GAGAATATGC TGCCACAAAT ACCCTTTTTG 120

CTGCTAGTAT CCTTGAACTT GGTTCATGGA GTGTTTTACG CTGAACGATA CCAAATGCCC 180 

ACAGGCATAA AAGGCCCACT ACCCAACACC AAGACACAGT TCTTCATTCC CTACACCATA 240 

AAGAGTAAAG GTATAGCAGT AAGAGGAGAG CAAGGTACTC CTGGTCCACC AGGCCCTGCT 300

GGACCTCGAG GGCACCCAGG TCCTTCTGGA CCACCAGGAA AACCAGGCTA CGGAAGTCCT 360

GGACTCCAAG GAGAGCCAGG GTTGCCAGGA CCACCGGGAC CATCAGCTGT AGGGAAACCA 420

GGTGTGCCAG GACTCCCAGG AAAACCAGGA GAGAGAGGAC CATATGGACC AAAAGGAGAT 480 

GTTGGACCAG CTGGCCTACC AGGACCCCGG GGCCCACCAG GACCACCTGG AATCCCTGGA 540 

CCGGCTGGAA TTTCTGTGCC AGGAAAACCT GGACAACAGG GACCCACAGG AGCCCCAGGA 600 

CCCAGGGGCT TTCCTGGAGA AAAGGGTGCA CCAGGAGTCC CTGGTATGAA TGGACAGAAA 660 

GGGGAAATGG GATATGGTGC TCCTGGTCGT CCAGGTGAGA GGGGTCTTCC AGGCCCTCAG 720

GGTCCCACAG GACCATCTGG CCCTCCTGGA GTGGGAAAAA GAGGTGAAAA TGGGGTTCCA 780

GGACAGCCAG GCATCAAAGG TGATAGAGGT TTTCCGGGAG AAATGGGACC AATTGGCCCA 840

CCAGGTCCCC AAGGCCCTCC TGGGGAACGA GGGCCAGAAG GCATTGGAAA GCCAGGAGCT 900 

GCTGGAGCCC CAGGCCAGCC AGGGATTCCA GGAACAAAAG GTCTCCCTGG GGCTCCAGGA 960

ATAGCTGGGC CCCCAGGGCC TCCTGGCTTT GGGAAACCAG GCTTGCCAGG CCTGAAGGGA 1020

GAAAGAGGAC CTGCTGGCCT TCCTGGGGGT CCAGGTGCCA AAGGGGAACA AGGGCCAGCA 1080 

GGTCTTCCTG GGAAGCCAGG TCTGACTGGA CCCCCTGGGA ATATGGGACC CCAAGGACCA 1140 

AAAGGCATCC CGGGTAGCCA TGGTCTCCCA GGCCCTAAAG GTGAGACAGG GCCAGCTGGG 1200

CCTGCAGGAT ACCCTGGGGC TAAGGGTGAA AGGGGTTCCC CTGGGTCAGA TGGAAAACCA 1260

GGGTACCCAG GAAAACCAGG TCTCGATGGT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 1320 

AAAGGTGATC CTGGAGTTGG AGGACCTCCT GGTCTCCCAG GCCCTGTGGG CCCAGCAGGA 1380 

GCAAAGGGAA TGCCCGGACA CAATGGAGAG GCTGGCCCAA GAGGTGCCCC TGGAATACCA 1440

GGTACTAGAG GCCCTATTGG GCCACCAGGC ATTCCAGGAT TCCCTGGGTC TAAAGGGGAT 1500 

CCAGGAAGTC CCGGTCCTCC TGGCCCAGCT GGCATAGCAA CTAAGGGCCT CAATGGACCC 1560

ACCGGGCCAC CAGGGCCTCC AGGTCCAAGA GGCCACTCTG GAGAGCCTGG TCTTCCAGGG 1620 

CCCCCTGGGC CTCCAGGCCC ACCAGGTCAA GCAGTCATGC CTGAGGGTTT TATAAAGGCA 1680 

GGCCAAAGGC CCAGTCTTTC TGGGACCCCT CTTGTTAGTG CCAACCAGGG GGTAACAGGA 1740
ATGCCTGTGT CTGCTTTTAC TGTTATTCTC TCCAAAGCTT ACCCAGCAAT AGGAACTCCC 1800

ATACCATTTG ATAAAATTTT GTATAACAGG CAACAGCATT ATGACCCAAG GACTGGAATC 1860 

TTTACTTGTC AGATACCAGG AATATACTAT TTTTCATACC ACGTGCATGT GAAAGGGACT 1920

CATGTTTGGG TAGGCCTGTA TAAGAATGGC ACCCCTGTAA TGTACACCTA TGATGAATAC 1980 

ACCAAAGGCT ACCTGGATCA GGCTTCAGGG AGTGCCATCA TCGATCTCAC AGAAAATGAC 2040 

CAGGTGTGGC TCCAGCTTCC CAATGCCGAG TCAAATGGCC TATACTCCTC TGAGTATGTC 2100

CACTCCTCTT TCTCAGGATT CCTAGTGGCT CCAATGTGAG TACA C CA ? GAGCTAATC 2160 

TAAATCTTGT GCTAGAAAAA GCATTCTCTA ACTCTACCCC ACCCTACAAA ATGCATATGG 2220

AGGTAGGCTG AAAAGAATGT AATTTTTATT TTCTGAAATA CAGATTTGAG CTATCAGACC 2280 

AACAAACCTT CCCCCTGAAA AGTGAGCAGC AACGTAAAAA CGTATGTGAA GCCTCTCTTG 2340

AATTTCTAGT TAGCAATCTT AAGGCTCTTT AAGGTTTTCT CCAATATTAA AAAATATCAC 2400 

CAAAGAAGTC CTGCTATGTT AAAAACAAAC AACAAAAAAC AAAGCAACAA AAAAAAAAAT 2460 

TAAAAAAAAA AACAGAAATA GAGCTCTAAG TTATGTGAAA TTTGATTTGA GAAACTCGGC 2520

ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC TATGAATATG AGAACTTCTA GGAAACATCC 2580 

AGGAGGTATC ATATAACTTT GTAGAACTTA AATACTTGAA TATTCAAATT TAAAAGACAC 2640 

r AAAATATTTC TGATGGTGCA CTACTCTGAG GCCTGTATGG CCCCTTTCAT 2700 

r TCAAATATAC AGGTGCATAT ATACTTGTTA AAGCTCTTAT ATAAAAAAGC 2760 

CCCAAAATAT TGAAGTTCAT CTGAAATGCA AGGTGCTTTC ATCAATGAAC CTTTTCAAAA 2820 

CTTTTCTATG ATTGCAGAGA AGCTTTTTAT ATACCCAGCA TAACTTGGAA ACAGGTATCT 2880

GACCTATTCT TATTTAGTTA ACACAAGTGT GATTAATTTG ATTTCTTTAA TTCCTTATTG 2940

AATCTTATGT GATATGATTT TCTGGATTTA CAGAACATTA GCACATGTAC CTTGTGCCTC 3000 

CCATTCAAGT GAAGTTATAA TTTACACTGA GGGTTTCAAA ATTCGACTAG AAGTGGAGAT 3060

ATATTATTTA TTTATGCACT GTACTGTATT TTTATATTGC TGTTTAAAAC TTTTAAGCTG 3120

TGCCTCACTT ATTAAAGCAC AAAATGTTTT ACCTACTCCT TATTTACGAC ACAATAAAAT 3180 

AACATCAATA GATTTTTAGG CTGAATTAAT TTGAAAGCAG CAATTTGCTG TTCTCAACCA 3240
TTCTTTCAAG GCTTTTCATT CGACACAATA AAATAACATC AATAG 

Seq ID NO: 435 Protein sequence 
Protein Accession #: NP_000484.2

1 11 21 31 41 51

| | | | | |

MLPQIPFLLL VSLNLVHGVF YAERYQMPTG IKGPLPNTKT QFFIPYTIKS KGIAVRGEQG 60

TPGPPGPAGP RGHPGPSGPP GKPGYGSPGL QGEPGLPGPP GPSAVGKPGV PGLPGKPGER 120 

GPYGPKGDVG PAGLPGPRGP PGPPGIPGPA GISVPGKPGQ QGPTGAPGPR GFPGEKGAPG 180

VPGMNGQKGE MGYGAPGRPG ERGLPGPQGP TGPSGPPGVG KRGENGVPGQ PGIKGDRGFP 240 

GEMGPIGPPG PQGPPGERGP EGIGKPGAAG APGQPGIPGT KGLPGAPGIA GPPGPPGFGK 300 

PGLPGLKGER GPAGLPGGPG AKGEQGPAGL PGKPGLTGPP GNMGPQGPKG IPGSHGLPGP 360

KGETGPAGPA GYPGAKGERG SPGSDGKPGY PGKPGLDGPK GNPGLPGPKG DPGVGGPPGL 420 

PGPVGPAGAK GMPGHNGEAG PRGAPGIPGT RGPIGPPGIP GFPGSKGDPG SPGPPGPAGI 480

ATKGLNGPTG PPGPPGPRGH SGEPGLPGPP GPPGPPGQAV MPEGFIKAGQ RPSLSGTPLV 540

SANQGVTGMP VSAFTVILSK AYPAIGTPIP FDKILYNRQQ HYDPRTGIFT CQIPGIYYFS 600

YHVHVKGTHV WVGLYKNGTP VMYTYDEYTK GYLDQASGSA IIDLTENDQV WLQLPNAESN 660

GLYSSEYVHS SFSGFLVAPM 

Seq ID NO: 436 DNA sequence
Nucleic Acid Accession #: XM_062811
Coding sequence: 1..888

1 11 21 31 41 51

| | | | | |

ATGTGGGGCG CTCGCCGCTC GTCCGTCTCC TCATCCTGGA ACGCCGCTTC GCTCCTGCAG 60 

CTGCTGCTGG CTGCGCTGCT GGCGGCGGGG GCGAGGGCCA GCGGCGAGTA CTGCCACGGC 120 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 180 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 240

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 300 

CGGGCGGACA AAGACGGCCC CGACGGCTCG GCAGTGCCCA TCTACGTGCC GTTCCTCATT 360 

GTTGGCTCCG TGTTTGTCGC CTTTATCATC TTGGGGTCCC TGGTGGCAGC CTGTTGCTGC 420 

AGATGTCTCC GGCCTAAGCA GGATCCCCAG CAGAGCCGAG CCCCAGGGGG TAACCGCTTG 480 

ATGGAGACCA TCCCCATGAT CCCCAGTGCC AGCACCTCCC GGGGGTCGTC CTCACGCCAG 540

TCCAGCACAG CTGCCAGTTC CAGCTCCAGC GCCAACTCAG GGC-CCCGC-GC GCCCCCAACA 600 

AGGTCACAGA CCAACTGTTG CTTGCCGGAA GGGACCATGA ACAACGTGTA TGTCAACATG 660 

CCCACGAATT TCTCTGTGCT GAACTGTCAG CAGGCCACCC AGATTGTGCC ACATCAAGGG 720

CAGTATCTGC ATCCCCCATA CGTGGGGTAC ACGGTGCAGC ACGACTCTGT GCCCATGACA 780 
GCTGTGCCAC CTTTCATGGA CGGCCTGCAG CCTGGCTACA GGCAGATTCA GTCCCCCTTC

CCTCACACCA ACAGTGAACA GAAGATGTAC CCAGCGGTGA C 


1 11 21 31 41 51 

MWGARRSSVS SSWNAASLLQ LLLAALLAAG ARASGEYCHG WLDAQGVWRI GFQCPERFDG
GDATICCGSC ALRYCCSSAE ARLDQGGCDN DRQQGAGEPG RADKDGPDGS AVPIYVPFLI
VGSVFVAFII LGSLVAACCC RCLRPKQDPQ QSRAPGGNRL METIPMIPSA STSRGSSSRQ
SSTAASSSSS ANSGARAPPT RSQTNCCLPE GTMNNVYVNM PTNFSVLNCQ QATQIVPHQG
QYLHPPYVGY TVQHDSVPMT AVPPFMDGLQ PGYRQIQSPF PHTNSEQKMY PAVTV 

Seq ID NO: 438 DNA sequence
Nucleic Acid Accession #: NM_004004.1
Coding sequence: 1..681 

1 11 21 31 41 51 

| | | | | |

ATGGATTGGG GCACGCTGCA GACGATCCTG GGGGGTGTGA ACAAACACTC CACCAGCATT
GGAAAGATCT GGCTCACCGT CCTCTTCATT TTTCGCATTA TGATCCTCGT TGTGGCTGCA
AAGGAGGTGT GGGGAGATGA GCAGGCCGAC TTTGTCTGCA ACACCCTGCA GCCAGGCTGC
AAGAACGTGT GCTACGATCA CTACTTCCCC ATCTCCCACA TCCGGCTATG GGCCCTGCAG 240

CTGATCTTCG TGTCCAGCCC AGCGCTCCTA GTGGCCATGC ACGTGGCCTA CCGGAGACAT 300 

GAGAAGAAGA GGAAGTTCAT CAAGGGGGAG ATAAAGAGTG AATTTAAGGA CATCGAGGAG 360

ATCAAAACCC AGAAGGTCCG CATCGAAGGC TCCCTGTGGT GGACCTACAC AAGCAGCATC 420 

TTCTTCCGGG TCATCTTCGA AGCCGCCTTC ATGTACGTCT TCTATGTCAT GTACGACGGC 480 
TTCTCCATGC AGCGGCTGGT GAAGTGCAAC GCCTGGCCTT G 






MDWGTLQTIL GGVNKHSTSI GKIWLTVLFI FRIMILVVAA KEVWGDEQAD FVCNTLQPGC
KNVCYDHYFP ISHIRLWALQ LIFVSSPALL VAMHVAYRRH EKKRKFIKGE IKSEFKDIEE
IKTQKVRIEG SLWWTYTSSI FFRVIFEAAF MYVFYVMYDG FSMQRLVKCN AWPCPNTVDC 
FVSRPTEKTV FTVFMIAVSG ICILLNVTEL CYLLIRYCSG KSKKPV 



Seq ID NO: 440 DNA sequence 

Nucleic Acid Accession #: XM_061091.1

Coding sequence: 1..2481 

1 11 21 31 41 51 

| | | | | |

ATGCCAAATA CTTCAGGAAC AACCAGGATT GAAATTTGGC TTCTCCAAGA GCCGCCCGGG 
CACCGAGCGC TGGTCGCCGC TCTCCTTCCG GTGAGTCCCA GCCCCGAGTT GGCTCTGGCG 
CCCGGGTACC CGCCAGTGCC GGCTGCCGAT GACCGATTCA CGCTCCCGAT GATTGGAGGT 
CAGATGCATG GTGAGAAGGT AGATCTCTGG AGCCTTGGTG TTCTTTGCTA TGAATTTTTA 
GTTGGGAAGC CTCCTTTTGA GGCAAACGAA GTCCATGTAA GCAAAGAAAC CATCGGGAAG 
ATTTCAGCTG CCAGCAAAAT GATGTGGTGC TCGGCTGCAG TGGACATCAT GTTTCTGTTA 
GATGGGTCTA ACAGCGTCGG GAAAGGGAGC TTTGAAAGGT CCAAGCACTT TGCCATCACA 
GTCTGTGACG GTCTGGACAT CAGCCCCGAG AGGGTCAGAG TGGGAGCATT CCAGTTCAGT 
TCCACTCCTC ATCTGGAATT CCCCTTGGAT TCATTTTCAA CCCAACAGGA AGTGAAGGCA 
AGAATCAAGA GGATGGTTTT CAAAGGAGGG CGCACGGAGA CGGAACTTGC TCTGAAATAC 
CTTCTGCACA GAGGGTTGCC TGGAGGCAGA AATGCTTCTG TGCCCCAGAT CCTCATCATC 
GTCACTGATG GGAAGTCCCA GGGGGATGTG GCACTGCCAT CCAAGCAGCT GAAGGAAAGG 
GGTGTCACTG TGTTTGCTGT GGGGGTCAGG TTTCCCAGGT GGGAGGAGCT GCATGCACTG
CCCACCGACC CTAGAGGGCA GCACGTGCTG TTGGCTGAGC AGGTGGAGGA TGCCACCAAC 
GGCCTCTTCA GCACCCTCAG CAGCTCGGCC ATCTGCTCCA GCGCCACGCC AGCTGGGAGC



CAGCCCTGCC AGAATGGAGG CACATGTGTT C 
TGCCCGCTGG CCTTTGGAGG GGAGGC1 

GTCGACCTCC TCTTCCTGCT GGACAGCTCT GCGGGCACCA CTCTGGACGG CTTCCTGCGC 
GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG GCCGTGCTGA GCGAGGACTC TCGGGCCCGA 
GTGGGTGTGG CCACATACAG CAGGGAGCTG CTGGTGGCGG TGCCTGTGGG G 
,. rGTG !CTG ACCTQGTCTG GAGCCTCGAT GGCATTCCCT 
ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG CGTGGCTTCG GGAGCGCCAC CAGGACAGGC 
CAGGACCGGC CACGTAGAGT GGTGGTTTTG CTCACTGAGT CACACTCCGA G

:CAG CGCGTCACGC AAGGGCGCGA GAGCTGCTCC TGCTGGGT 

GCCGTGCGGG CAGAGCTGGA GGAGATCACA GGCAGCCCAA AGCATGTGAT GGTCTACTCG 1560

GATCCTCAGG ATCTGTTCAA CCAAATCCCT GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG 1620

CGGCCAGGGT GCCGGACACA AGCCCTGGAC CTCGTCTTCA TGTTGGACAC CTCTGCCTCA 1680 

GTAGGGCCCG AGAATTTTGC TCAGATGCAG AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT 1740

GAGGTGAACC CTGACGTGAC ACAGGTCGGC CTGGTGGTGT ATGGCAGCCA GGTGCAGACT 1800

GCCTTCGGGC TGGACACCAA ACCCACCCGG GCTGCGATGC TGCGGGCCAT TAGCCAGGCC 1860 

CCCTACCTAG GTGGGGTGGG CTCAGCCGGC ACCGCCCTGC TGCACATCTA TGACAAAGTG 1920

ATGACCGTCC AGAGGGGTGC CCGGCCTGGT GTCCCCAAAG CTGTGGTGGT GCTCACAGGC 1980 

GGGAGAGGCG CAGAGGATGC AGCCGTTCCT GCCCAGAAGC TGAGGAACAA TGGCATCTCT 2040

GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC 2100 

CGGGATTCCC TGATCCACGT GGCAGCTTAC GCCGACCTGC GGTACCACCA GGACGTGCTC 2160 

ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG CCAGTCAACC TCTGCAAACC CAGCCCGTGC 2220 

ATGAATGAGG GCAGCTGCGT CCTGCAGAAT GGGAGCTACC GCTGCAAGTG TCGGGATGGC 2280 

TGGGAGGGCC CCCACTGCGA GAACCGTGAG TGGAGCTCTT GCTCTGTATG TGTGAGCCAG 2340 

GGATGGATTC TTGAGACGCC CCTGAGGCAC ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT 2400

ACCCCTCCCA GCAACTACAG AGAAGGCCTG GGCACTGAAA TGGTGCCTAC CTTCTGGAAT 2460 
GTCTGTGCCC CAGGTCCTTA G 

Seq ID NO: 441 Protein sequence 
Protein Accession #: XP_061091.1

1 11 21 31 41 51

| | | | | |

MPNTSGTTRI EIWLLQEPPG HRALVAALLP VSPSPELALA PGYPPVPAAD DRFTLPMIGG 60

QMHGEKVDLW SLGVLCYEFL VGKPPFEANE VHVSKETIGK ISAASKMMWC SAAVDIMFLL 120 

DGSNSVGKGS FERSKHFAIT VCDGLDISPE RVRVGAFQFS STPHLEFPLD SFSTQQEVKA 180 

RIKRMVFKGG RTETELALKY LLHRGLPGGR NASVPQILII VTDGKSQGDV ALPSKQLKER 240 

FPRWEELHAL ASEPRGQHVL LAEQVEDATN GLFSTLSSSA ICSSATPAGS 300 

3ISLIGPCDS QPCQNGGTCV PEGLDGYQCL CPLAFGGEAK CALKLSLECR 360 

S AGTTLDGFLR AKVFVKRFVR AVLSEDSRAR VGVATYSREL LVAVPVGEYQ 420 

DVPDLVWSLD GIPFRGGPTL TGSALRQAAE RGFGSATRTG QDRPRRVVVL LTESHSEDEV 480

AGPARHARAR ELLLLGVGSE AVRAELEEIT GSPKHVMVYS DPQDLFNQIP ELQGKLCSRQ 540 

RPGCRTQALD LVFMLDTSAS VGPENFAQMQ SFVRSCALQF EVNPDVTQVG LVVYGSQVQT 600

AFGLDTKPTR AAMLRAISQA PYLGGVGSAG TALLHIYDKV MTVQRGARPG VPKAVVVLTG 660

GRGAEDAAVP AQKLRNNGIS VLVVGVGPVL SEGLRRLAGP RDSLIHVAAY ADLRYHQDVL 720
IEWLCGEAKQ PVNLCKPSPC MNEGSCVLQN GSYRCKCRDG WEGPHCENRE WSSCSVCVSQ
GWILETPLRH MAPVQEGSSR TPPSNYREGL GTEMVPTFWN VCAPGP 

Seq ID NO: 442 DNA sequence
Nucleic Acid Accession 
Coding sequence: 1..242 






AGCGTCGGGA 
CTGGACATCA 
CTGGAATTCC 
ATGGTTTTCA 
GGGTTGCCTG 
AAGTCCCAGG 
TTTGCTGTGG 
AGAGGGCAGC 
ACCCTCAGCA 



TCCTGTTGCT GGAGGCCGTC 
TCCAGGAAGT CCATGTAAGC AAAGAAACCA TCGGGAAGAT 
GGCTGCAGTG GACATCATGT TTCTGTTAGA 



TTCAGCTGCC 



GCCCCGAGAG 
CCTTGGATTC 
AAGGAGGGCG 



GGTCAGAGTG 



GGGATGTGGC 
ACGTGCTGTT 



TGCTTCTGTG 
ACTGCCATCC 
TCCCAGGTGG 
GGCTGAGCAG 
CTGCTCCAGC 



GGAGCATTCC 
CAACAGGAAG 
GAACTTGCTC 



AGTTCAGTTC C 



AAGCAGCTGA 
GAGGAGCTGC 
GTGGAGGATG 



TCATCATCGT 
AGGAAAGGGG 
ATGCACTGGC 



AGAGTGTTCC 
TCGCAGCCCT 
CTCTGCCCGC 
AGGGTCGACC 



GCTGGACAGC 



ACAGGACGCT 
GGCGGACCCT 
TAACCCACCC 
GCCAGAATGG 
TGGCCTTTGG 
TCCTCTTCCT 
TCTTCGTGAA 
TGGCCACATA 
CTGACCTGGT 
GTGCCTTGCG 
GGCCACGTAG 
CAGCGCGTCA 
GAGGCCGTGC GGGCAGAGCT GGAGGAGATC 



GCTGCACACT 
TACAGGACCA 
GTTCCAGAAG 
AACTGTGCCC 
TCTGCGGGCA 



ACTGCAGGGT 
CTGGCAATGC 
GTCCCTTCTA 
CCTGCCCAGG 
GACTGGACGG 



TCTGCACAGA 
CACTGATGGG 
TGTCACTGTG 
CAGCGAGCCT 
CCTCTTCAGC 
CGAGGCTCAC 
CCCATGCTGG 
CAGCTGGAAG 



CAGGATGTGC 
CTGACGGGCA 
GGCCAGGACC 



TCAGTAGGGC 
TTTGAGGTGA 
ACTGCCTTCG 
GCCCCCTACC 
GTGATGACCG 



ACCCTGACGT 



TTGCTCACTG 
CGAGAGCTGC 
ACAGGCAGCC 
CCTGAGCTGC 
GACCTCGTCT 
CAGAGCTTTG 
GGCCTGGTGG 



CGGTGCCTGT 
CCTTCCGTGG 
TCGGGAGCGC 
AGTCACACTC 
TCCTGCTGGG 
CAAAGCATGT 
AGGGGAAGCT 
TCATGTTGGA 
TGAGAAGCTG 



CTACCAGTGC 
CCTGGAATGC 
CGGCTTCCTG 
CTCTCGGGCC 
GGGGGAGTAC 
TGGCCCCACC 
CACCAGGACA 



TCTCTCTTGG 
CCCCGGGATT 
CTCATTGAGT



GGCTGGGAGG 



TAGGTGGGGT 
TCCAGAGGGG 
GCGCAGAGGA 
TCGTGGGCGT 
CCCTGATCCA 
GGCTGTGTGG 
AGGGCAGCTG 
GCCCCCACTG 
TTCTTGAGAC 
CCAGCAACTA 



GGGCTCAGCC 



TGTAGGCAGT 
GATGGTCTAC 
GTGCAGCCGG 
CACCTCTGCC 
TGCCCTCCAG 
CCAGGTGCAG 
CATTAGCCAG 



TGCAGCCGTT 
GGGGCCTGTC 
CGTGGCAGCT 
AGAAGCCAAG 
CGTCCTGCAG 
CGAGAACCGT 
GCCCCTGAGG 
CAGAGAAGGC 



\ AGCTGAGGAA C 



TACGCCGACC TGCGGTACCA 
CAGCCAGTCA ACCTCTGCAA 
AATGGGAGCT ACCGCTGCAA 
GAGTGGAGCT CTTGCTCTGT 
CACATGGCTC CCGTGCAGGA 
CTGGGCACTG AAATGGTGCC 



CCAGGACGTG 
ACCCAGCCCG 
GTGTCGGGAT 
ATGTGTGAGC 
GGGCAGCAGC 
TACCTTCTGG 



MPPFLLLEAV 



MVFKGGRTET 
FAVGVRFPRW 
PCEHRTLEMV 
SQPCQNGGTC 



LTGSALRQAA 
EAVRAELEEI 
SVGPENFAQM 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVLQ 
RTPPSNYREG 



I 

CVFLFSRVPP 
KHFAITVCDG 
ELALKYLLHR 
EELHALASEP 
REFAGNAPCW 
VPEGLDGYQC 
RAVLSEDSRA 
ERGFGSATRT 



SLPLQEVHVS 



GLPGGRNASV 
RGQHVLLAEQ 
RGSRRTLAVIi 
LCPLAFGGEA 
RVGVATYSRE 
GQDRFRRVW 
SDPQDLFNQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSLIHVAA 



I 

" KETIGKISAA 
GAFQFSSTPH 
PQILIIVTDG 
VEDATNGLFS 
AAHCPFYSWK 
NCALKLSLEC 
LLVAVPVGEY 



SKMMWCSAAV 



RVFLTHPATC 



PELQGKLCSR 
GLWYGSQVQ 
GVPKAVWLT 
YADLRYHQDV 



LGTEMVPTFW NVCAPGP 



QDVPDLVWSL 
VAGPARHARA 
QRFGCRTQAL 
TAFGLDTKPT 
GGRGAEDAAV 
LIEWLCGEAK 
QGWILETPLR 



DIMFLLDGSN 
QQEVKARIKR 
KQLKERGVTV 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTLDGFL 
DGIPFRGGPT 
RELLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLRNNGI 
QPVNLCKPSP 



Seq ID NO: 444 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 39..2356






3 CCCGGGTCTG TGAGTAGAGC CGCCCGGGCA CCGAGCGCTG 
GTCGCCGCTC TCCTTCCGTT ATATCAACAT GCCCCCTTTC CTGTTGCTGG AAGCCGTCTG 
TGTTTTCCTG TTTTCCAGAG TGCCCCCATC TCTCCCTCTC CAGGAAGTCC ATGTAAGCAA 
AGAAACCATC GGGAAGATTT CAGCTGCCAG CAAAATGATG TGGTGCTCGG CTGCAGTGGA 
CATCATGTTT CTGTTAGATG GGTCTAACAG CGTCGGGAAA GGGAGCTTTG AAAGGTCCAA 
GCACTTTGCC ATCACAGTCT GTGACGGTCT GGACATCAGC CCCGAGAGGG TCAGAGTGGG 
AGCATTCCAG TTCAGTTCCA CTCCTCATCT GGAATTCCCC TTGGATTCAT TTTCAACCCA 
ACAGGAAGTG AAGGCAAGAA TCAAGAGGAT GGTTTTCAAA GGAGGGCGCA CGGAGACGGA 
ACTTGCTCTG AAATACCTTC TGCACAGAGG GTTGCCTGGA GGCAGAAATG CTTCTGTGCC 
CCAGATCCTC ATCATCGTCA CTGATGGGAA GTCCCAGGGG GATGTGGCAC TGCCATCCAA 




GCAGCTGAAG GAAAGGGGTG TCACTGTGTT T 
GGAGCTGCAT GCACTGGCCA GCGAGCCTAG A 1 



CACGCCAGAC 



GTCAGGTTTC 
GTGCTGTTGG 
TCGGCCATCT 



CTGAGCAGGT 



TGCACACTGT CCCTTCTACA GCTGGAAGAG 
CAGGACCACC TGCCCAGGCC CCTGTGACTC 
TCCAGAAGGA CTGGACGGCT ACCAGTGCCT 
G AAGCTGAGCC TGGAATGCAG 



GCAGCCCTGC 
CTGCCCGCTG 
GGTCGACCTC 



CGG C GTT j 

CAGAATGGAG 
GCCTTTGGAG
CTCTTCCTGC 



CGGTGCTGGC
CCACCTGCTA 
GCACATGTGT 






TGAGCTGCAG 

GAGCTTTGTG 
CCTGGTGGTG 
GGCTGCGATG 
CACCGCCCTG 
TGTCCCCAAA 
TGCCCAGAAG 
AAGTGAGGGT 
CGCCGACCTG 
GCCAGTCAAC 
TCGGAGCTAC 
CTTGAGACGC 



GGGAAGCTGT 
AGAAGCTGTG 



TGGTCTACTC GGATCCTCAG GATCTGTTCA 
GCAGCCGGCA GCGGCCAGGG TGCCGGACAC 
AGTAGGGCCC GAGAATTTTG 
CCCTCCAGTT TGAGGTGAAC CCTGACGTGA 
AGGTGCAGAC TGCCTTCGGG CTGGACACCA 



CTGCACATCT ATGACAAAGT 



CCAGGTCCTT 



CTGAGGAACA 
CTGCGGAGGC 
CGGTACCACC 
CTCTGCAAAC 
CGCTGCAAGT 
CCCTGAGGCA 
GAGAAGGCCT 



ATGGCATCTC 
TTGCAGGTCC 
AGGACGTGCT 



AAACGATGTT 
GACTTAAATT 



CAACTGCAGC 
GTTGAAAAGT 
GGCTATGTCA 
TAGCGGCCTG 
ATGCCCAGCA 



GGGCACTGAA 
CTTCCCGCCG 
CATGCTGCTT 
TTTGATGTGT 
TCTGCCACCT 
ACGTTCCTT' 



GATGACCGTC 
CGGGAGAGGC 
TGTCTTGGTC 
CCGGGATTCC 
CATTGAGTGG 
CATGAATGAG 
CTGGGAGGGC 
GTGCAGGAGG 
ATGGTGCCTA 



ACCAAATCCC 
AAGCCCTGGA 
CTCAGATGCA 
CACAGGTCGG 
AACCCACCCG 
GCTCAGCCGG 



GCAGAGGATG C 



CTGTGTGGAG 



GAGGCCTT1 



AGAGACAAGA 
AAGTAAATAC 
TTCCCTTGAG 
!AATC 



CTAGAGCATC 



GCAGCAGCCG 
CCTTCTGGAA 
CACTATTCTC
AAGCAGCTGA 
CCACTTTCTG 
GATAAACAAG 
AATGCTCGCC 
CTTTGGACGG 



AAGCCAAGCA 
"TCCTGCAGAA 
AGAACCGATT 
TACCCCTCCC 
TGTCTGTGCC 
ACTGAGGGAG 
TGTCACCCAC 
TACCTGCTGT 



AGAATGTTGT 



2400 
2460 
2520 
2580 
2640
2700 



1 11 21 31 

| | | |

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA 

V GAFQFSSTPH 

V PQILIIVTDG 

2 VEDATNGLFS 
L AAHCPFYSWK 
k NCALKLSLEC 

3 LLVAVPVGEY 



SQPCQNGGTC 
RAKVFVKRFV 
LTGSALRQAA 
EAVRAELEEI 
SVGPENFAQM 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVLQ 



RGQHVLLAEQ V
LCPLAFGGEA N 



ERGFGSATRT 
TGSPKHVMVY
QSFVRSCALQ 
GTALLHIYDK 



GQDRPRRVVV
SDPQDLFNQI 
FEVNPDVTQV 



I 

SKMMWCSAAV
LEFPLDSFST 
KSQGDVALPS 
TLSSSAICSS 
RVFLTHPATC 
RVDLLFLLDS 
QDVPDLVWSL



NGSYRCKCRD G 



V LIEWLCGEAK 



YRTTCPGPCD 
SAGTTLDGFL 
DGIPFRGGPT 
RELLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLRNNGI 
QPVNLCKPSP 



Seq ID NO: 446 DNA sequence
Nucleic Acid Ac 
Coding sequence: 



TGCTCCTCCT G 



I I I 

G GCGCGCCCAG CCTGCCAGCC GCGCTGCTGC 



GTAAAGAAGA 
TCCTCTGATG 
TCAGTTCGGG 
GCGATGAAGT 
CAGCCCTCAG 



ATGTCTGAAT 
GACTCACAAT 
CCTGAACGGA 
GCTCTACCCA 
ACCGTGGATG 
GTGACCCTTC 
GTCTGCAGCA 



ACTTAAAGAA 
ACAGTTGTGA 
AAGGCTGTAG 
TTCCAGCGCG 
AGAATTCTGT 
AGAAAAGGGC 
TAGAAAGCTT 
CAAGGAGACC
GAGCTCGTCC 



GTGAAGTTGA 
TCTGATAATT 
CAGTGCAGGC 
GGAGCAACCA 
AACTCCGATT 
AAGCAAAACA 



CGCTCTCCCC 
TGCCGCAGAA 
TTTCCATGGA 
TTGCAAACAC 



ACAAAAAAGC 



A TGAAGATGAC 



CGAGGCCAGT 

CGGCAGCGAG 

TATCTGGAAA 
TTTTTTCACT 
AATCAAGTTA 



GAAGATATAT 
TACCAAAACA 
CTGCCTTCGA 
TTGCCCGCCT 



ACATTCCCGG 
TCAAGGTCCC 
GATAAGTACA 
CTGCCCAGAA 
GAAGAAATTA 
AACCGTTCAC 
AACTGCAGAA 



AAGCAATGCT 
GACATCCCCT 
GTGTTGCTTC
GGATCCTCGG 



GCTCCAAGCG 
AGATCTCAGA 
AACCTCGTCA 
GAGGCTGCAG 
TCTCAGGGTG 
AGAGTCCCGC 
AAGTGGAATG 
TGCAAAACTC 
CCCAGGCTCC 
CAGGAGAAAC 
GTCCCTTGAC 



TGTTGGTGAG AAAGAGGAAG 780 



TGTCGAGGAA 



CAGATCATCC 
CAGAGGAGGA GTTGGAGAAC 
TGGGCTCTAC TTGTCATCAA 
ACCCAGACTG CTGGGGCGTT 
GTGAAGAGGT CAGGGATGCT 
TCTGCAACTG CAGTTTCTGC 



TGCATGCCTA CTTGAAAAGC CTGAAACAGG AATTTGAAAT GCAAGCATAA 
ATTTGCTGCC TGCCTTCTAC TTCTCAAATC TTTCTTGTAA AAGTTTCCAA 
GAAACCTGAG TTAAAAATCT TGATGATCAG CCTGTTTCAT AAGAAACTCC 
ATCTTAGCAG ACATGTGTTT CTGGAGCATC ACAGAAGGTA TATTGCTAGT 




TACACTTTGC CCTCCTGCAG TTTCTTCTCT GCTCCCAACC CCCATCTCAT AGCATCCCCC 1500 

TCTATTTCCA ATGCTCCTCT CCAACCGCTT AGTTTCTGAA TTTCTTTTAA ATTACAGTTT 1560 

TATGAAAGCA TATTTTATTT ACTTGGTGTT GAAATAGCCC TCATAAAACC TAAGCACTTG 1620

GAAACACAAT AATAGTATTA ACTAACTAGA TCTATTGAAT TTCAGAGAAG AGCCTTCTAA 1680

CTTGTTTACA CAAAAACGAG TATGATTTAG CACTCATACT AGTTGAAATT TTTAATAGAA 1740 

TCAAGGCACA AAAGTCTTAA AACCATGTGG AAAAATTAGG TAATTATTGC AGATTGATGT 1800

CTCTCAATCC CATGTATTGC GCTTATGTTA CAAGTTGTTG TCACAGTTGA GACTTAATTT 1860

CTCCTAATTT CTTCTGCCCG AAGGGTAAGT GGTGCGTCCA GCTTACACGA TCATAATTCA 1920

AAGGTTGGTG GGCAATGTAA TACTTAATTA AAATAATGAT GGAAGAGCTA TCTGGAGATT 1980 

ATGAGTAAGC TGATTTGAAT TTTCAGTATA AAACTTTAGT ATAATTGTAG TTTGCAAAGT 2040 

TTATTTCAGT TCACATGTAA GGTATTGCAA ATAAATTCTT GGACAATTTT GTATGGAAAC 2100

TTGATATTAA AAACTAGTCT GTGGTTCTTT GCAGTTTCTT GTAAATTTAT AAACCAGGCA 2160 

CAAGGTTCAA GTTTAGATTT TAAGCACTTT TATAACAATG ATAAGTGCCT TTTTGGAGAT 2220

GTAACTTTTA GCAGTTTGTT AACCTGACAT CTCTGCCAGT CTAGTTTCTG GGCAGGTTTC 2280

CTGTGTCAGT ATTCCCCCTC CTCTTTGCAT TAATCAAGGT ATTTGGTAGA GGTGGAATCT 2340 

AAGTGTTTGT ATGTCCAATT TACTTGCATA TGTAAACCAT TGCTGTGCCA TTCAATGTTT 2400

GATGCATAAT TGGACCTTGA ATCGATAAGT GTAAATACAG CTTTTGATC? GTAATGCTTT 2460
TATACAAAAG TTTATTTTAA TAATAAAATG TTTGTTCTAA AAAAAAAAAA 



1 11 21 31 41 51 

| | | | | |

MDARRVPQKD LRVKKNLKKF RYVKLISMET SSSSDDSCDS FASDNFANTR LQSVREGCRT 60 

RSQCRHSGPL RVAMKFPARS TRGATNKKAE SRQPSENSVT DSNSDS3DES GMNFLEKRAL 120

NIKQNKAMLA KLMSELESFP GSFRGRHPLP GSDSQSRRPR RRTFPGVASR RNPERRARPL 180 

TRSRSRILGS LDALPMEEEE EEDKYMLVRK RKTVDGYMNE DDLPRSRRSR SSVTLPHIIR 240 

PVEEITEEEL ENVCSNSREK IYNRSLGSTC HQCRQKTIDT KTNCRNPDCW GVRGQFCGPC 300 

LRNRYGEEVR DALLDPNWHC PPCRGICNCS FCRQRDGRCA TGVLVYLAKY HGFGNVHAYL 360 
KSLKQEFEMQ A 

Seq ID NO: 448 DNA sequence 

Nucleic Acid Accession #: NM_019894

Coding sequence: 1..1314

1 11 21 31 41 51

| | | | | |

ATGTTACAGG ATCCTGACAG TGATCAACCT CTGAACAGCC TCGATGTCAA ACCCCTGCGC 60

AAACCCCGTA TCCCCATGGA GACCTTCAGA AAGGTGGGGA TCCCCATCAT CATAGCACTA 120

CTGAGCCTGG CGAGTATCAT CATTGTGGTT GTCCTCATCA AGGTGATTCT GGATAAATAC 180 

TACTTCCTCT GCGGGCAGCC TCTCCACTTC ATCCCGAGGA AGCAGCTGTG TGACGGAGAG 240

CTGGACTGTC CCTTGGGGGA GGACGAGGAG CACTGTGTCA AGAGCTTCCC CGAAGGGCCT 300

GCAGTGGCAG TCCGCCTCTC CAAGGACCGA TCCACACTGC AGGTGCTGGA CTCGGCCACA 360

GGGAACTGGT TCTCTGCCTG TTTCGACAAC TTCACAGAAG CTCTCGCTGA GACAGCCTGT 420

AGGCAGATGG GCTACAGCAG CAAACCCACT TTCAGAGCTG TGGAGATTGG CCCAGACCAG 480

GATCTGGATG TTGTTGAAAT CACAGAAAAC AGCCAGGAGC TTCGCATGCG GAACTCAAGT 540 

GGGCCCTGTC TCTCAGGCTC CCTGGTCTCC CTGCACTGTC TTGCCTGTGG GAAGAGCCTG 600

AAGACCCCCC GTGTGGTGGG TGGGGAGGAG GCCTCTGTGG ATTCTTGGCC TTGGCAGGTC 660 

AGCATCCAGT ACGACAAACA GCACGTCTGT GGAGGGAGCA TCCTGGACCC CCACTGGGTC 720 

CTCACGGCAG CCCACTGCTT CAGGAAACAT ACCGATGTGT TCAACTGGAA GGTGCGGGCA 780 

GGCTCAGACA AACTGGGCAG CTTCCCATCC CTGGCTGTGG CCAAGATCAT CATCATTGAA 840 

TTCAACCCCA TGTACCCCAA AGACAATGAC ATCGCCCTCA TGAAGCTGCA GTTCCCACTC 900 

ACTTTCTCAG GCACAGTCAG GCCCATCTGT CTGCCCTTCT TTGATGAGGA GCTCACTCCA 960

GCCACCCCAC TCTGGATCAT TGGATGGGGC TTTACGAAGC AGAATGGAGG GAAGATGTCT 1020 

GACATACTGC TGCAGGCGTC AGTCCAGGTC ATTGACAGCA CACGGTGCAA TGCAGACGAT 1080 

GCGTACCAGG GGGAAGTCAC CGAGAAGATG ATGTGTGCAG GCATCCCGGA AGGGGGTGTG 1140 

GACACCTGCC AGGGTGACAG TGGTGGGCCC CTGATGTACC AATCTGACCA GTGGCATGTG 1200

GTGGGCATCG TTAGCTGGGG CTATGGCTGC GGGGGCCCGA GCACCCCAGG AGTATACACC 1260
AAGGTCTCAG CCTATCTCAA CTGGATCTAC AATGTCTGGA AGGCTGAGCT GTAA 

Seq ID NO: 449 Protein sequence 



1 11 21 31 41 51 

| | | | | |

MLQDPDSDQP LNSLDVKPLR KPRIPMETFR KVGIPIIIAL LSLASIIIVV VLIKVILDKY
YFLCGQPLHF IPRKQLCDGE LDCPLGEDEE HCVKSFPEGP AVAVRLSKDR STLQVLDSAT
GNWFSACFDN FTEALAETAC RQMGYSSKPT FRAVEIGPDQ DLDVVEITEN SQELRMRNSS
GPCLSGSLVS LHCLACGKSL KTPRVVGGEE ASVDSWPWQV SIQYDKQHVC GGSILDPHWV
LTAAHCFRKH TDVFNWKVRA GSDKLGSFPS LAVAKIIIIE FNPMYPKDND IALMKLQFPL
TFSGTVRPIC LPFFDEELTP ATPLWIIGWG FTKQNGGKMS DILLQASVQV IDSTRCNADD
AYQGEVTEKM MCAGIPEGGV DTCQGDSGGP LMYQSDQWHV VGIVSWGYGC GGPSTPGVYT

Seq ID NO: 450 DNA sequence 

Nucleic Acid Accession #: XM_051860.2

Coding sequence: 52..3042



GCTCACCCAG GAAAAATATG CAATCGTCCC A 
GTTAACCTCA GCACCGAGGT TGTCTACAAA AAAGGCCAGG ATTATAGGTT TGCTTGCTAC 
GACCGGGGCA GAGCCTGCCG GAGCTACCGT GTAGGGTTCC TCTGTGGGAA GCCTGTGAGG 
CCCAAACTCA CAGTCACCAT TGACACCAAT GTGAACAGCA CCATTCTGAA CTTGGAGGAT
AATGTACAGT CATGGAAACC TGGAGATACC CTGGTCATTG CCAGTACTGA TTACTCCATG
TACCAGGCAG AAGAGTTCCA GGTGCTTCCC TGCAGATCCT GCGCCCCCAA CCAGGTCAAA

GTGGCAGGGA AACCAATGTA CCTGCACATC GGGGAGGAGA TAGACGGCGT GGACATGCGG 420

GCGGAGGTTG GGCTTCTGAG CCGGAACATC ATAGTGATGG GGGAGATGGA GGACAAATGC 480 

TACCCCTACA GAAACCACAT CTGCAATTTC TTTGACTTCG ATACCTTTGG GGGCCACATC 540 

AAGTTTGCTC TGGGATTTAA GGCAGCACAC TTGGAGGGCA CGGAGCTGAA GCATATGGGA 600 

CAGCAGCTGG TGGGTCAGTA CCCGATTCAC TTCCACCTGG CCGGTGATGT AGACGAAAGG 660 

GGAGGTTATG ACCCACCCAC ATACATCAGG GACCTCTCCA TCCATCATAC ATTCTCTCGC 720 

TGCGTCACAG TCCATGGCTC CAATGGCTTG TTGATCAAGG ACGTTGTGGG CTATAACTCT 780 

TTGGGCCACT GCTTCTTCAC GGAAGATGGG CCGGAGGAAC GCAACACTTT TGACCACTGT 840 

CTTGGCCTCC TTGTCAAGTC TGGAACCCTC CTCCCCTCGG ACCGTGACAG CAAGATGTGC 900 

AAGATGATCA CAGGAGACTC CTACCCAGGG TACATCCCCA AGCCCAGGCA AGACTGCAAT 960 

GCTGTGTCCA CCTTCTGGAT GGCCAATCCC AACAACAACC TCATCAACTG TGCCGCTGCA 1020 

GGAATGTACT CCCCAGGTTA TTCAGAGCAC ATTCCACTGG GAAAATTCTA TAACAACCGA 1140 

GCACATTCCA ACTACCGGGC TGGCATGATC ATAGACAACG GAGTCAAAAC CACCGAGGCC 1200 

TCTGCCAAGG ACAAGCGGCC GTTCCTCTCA ATCATCTCTG CCAGATACAG CCCTCACCAG 1260

GACGCCGACC CGCTGAAGCC CCGGGAGCCG GCCATCATCA GACACTTCAT TGCCTACAAG 1320

GCTGACAATG GCATTGGCCT GACCCTGGCC AGTGGTGGAA CCTTCCCGTA TGACGACGGC 1440 



ATGATGGACA ATAGGATCTG GGGCCCTGGC GGCTTGGACC ATAGCGGAAG GACCCTCCCT 1560 

ATAGGCCAGA ATTTTCCAAT TAGAGGAATT CAGTTATATG ATGGCCCCAT CAACATCCAA 1620 

AACTGCACTT TCCGAAAGTT TGTGGCCCTG GAGGGCCGGC ACACCAGCGC CCTGGCCTTC 1680 

CGCCTGAATA ATGCCTGGCA GAGCTGCCCC CATAACAACG TGACCGGCAT TGCCTTTGAG 1740 

GACGTTCCGA TTACTTCCAG AGTGTTCTTC GGAGAGCCTG GGCCCTGGTT CAACCAGCTG 1800 

GACATGGATG GGGATAAGAC ATCTGTGTTC CATGACGTCG ACGGCTCCGT GTCCGAGTAC 1860 

CCTGGCTCCT ACCTCACGAA GAATGACAAC TGGCTGGTCC GGCACCCAGA CTGCATCAAT 1920 

GTTCCCGACT GGAGAGGGGC CATTTGCAGT GGGTGCTATG CACAGATGTA CATTCAAGCC 1980 

TACAAGACCA GTAACCTGCG AATGAAGATC ATCAAGAATG ACTTCCCCAG CCACCCTCTT 2040

TACCTGGAGG GGGCGCTCAC CAGGAGCACC CATTACCAGC AATACCAACC GGTTGTCACC 2100 

CTGCAGAAGG GCTACACCAT CCACTGGGAC CAGACGGCCC CCGCCGAACT CGCCATCTGG 2160 

CTCATCAACT TCAACAAGGG CGACTGGATC CGAGTGGGGC TCTGCTACCC GCGAGGCACC 2220 

ACATTCTCCA TCCTCTCGGA TCTTCACAAT CGCCTGCTGA AGCAAACGTC CAAGACGGGC 2280

GTCTTCGTGA GGACCTTGCA GATGGACAAA GTGGAGCAGA GCTACCCTGG CAGGAGCCAC 2340

TACTACTGGG ACGAGGACTC AGGGCTGTTG TTCCTGAAGC TGAAAGCTCA GAACGAGAGA 2400

GAGAAGTTTG CTTTCTGCTC CATGAAAGGC TGTGAGAGGA TAAAGATTAA AGCTCTGATT 2460

CCAAAGAACG CAGGCGTCAG TGACTGCACA GCCACAGCTT ACCCCAAGTT CACCGAGAGG 2520 

GCTGTCGTAG ACGTGCCGAT GCCCAAGAAG CTCTTTGGTT CTCAGCTGAA AACAAAGGAC 2580

CATTTCTTGG AGGTGAAGAT GGAGAGTTCC AAGCAGCACT TCTTCCACCT CTGGAACGAC 2640 

TTCGCTTACA TTGAAGTGGA TGGGAAGAAG TACCCCAGTT CGGAGGATGG CATCCAGGTG 2700

GTGGTGATTG ACGGGAACCA AGGGCGCGTG GTGAGCCACA CGAGCTTCAG GAACTCCATT 2760 

CTGCAAGGCA TACCATGGCA GCTTTTCAAC TATGTGGCGA CCATCCCTGA CAATTCCATA 2820



GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACTCCTGTA AGCAAGAGCC AACCTCACAG 3540

GATTAGGAGC TGGGGTAGAA CTGGCTATCC TTGGGGAAGA GGCAAGCCCT GCCTCTGGCC 3600 

GTGTCCACCT TTCAGGAGAC TTTGAGTGGC AGGTTTGGAC TTGGACTAGA TGACTCTCAA 3660 

AGGCCCTTTT AGTTCTGAGA TTCCAGAAAT CTGCTGCATT TCACATGGTA CCTGGAACCC 3720

AACAGTTCAT GGATATCCAC TGATATCCAT GATGCTGGGT GCCCCAGCGC ACACGGGATG 3780 

GAGAGGTGAG AACTAATGCC TAGCTTGAGG GGTCTGCAGT CCAGTAGGGC AGGCAGTCAG 3840 

GTCCATGTGC ACTGCAATGC CAGGTGGAGA AATCACAGAG AGGTAAAATG GAGGCCAGTG 3900 

CCATTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATGAAGG CTGGGGGCAT 3960 

TTTGCTGGGG GGAGATGAGG CAGCCTCTGG AATGGCTCAG GGATTCAGCC CTCCCTGCCG 4020 

CTGCCTGCTG AAGCTGGTGA CTACGGGGTC GCCCTTTGCT CACGTCTCTC TGGCCCACTC 4080 

ATGATGGAGA AGTGTGGTCA GAGGGGAGCA ATGGGCTTTG CTGCTTATGA GCACAGAGGA 4140 

ATTCAGTCCC CAGGCAGCCC TGCCTCTGAC TCCAAGAGGG TGAAGTCCAC AGAAGTGAGC 4200

TCCTGCCTTA GGGCCTCATT TGCTCTTCAT CCAGGGAACT GAGCACAGGG GGCCTCCAGG 4260

AGACCCTAGA TGTGCTCGTA CTCCCTCGGC CTGGGATTTC AGAGCTGGAA ATATAGAAAA 4320

TATCTAGCCC AAAGCCTTCA TTTTAACAGA TGGGGAAAGT GAGCCCCCAA GATGGGAAAG 4380 

AACCACACAG CTAAGGGAGG GCCTGGGGAG CCCCACCCTA GCCCTTGCTG CCACACCACA 4440 

TTGCCTCAAC AACCGGCCCC AGAGTGCCCA GGCACTCCTG AGGTAGCTTC TGGAAATGGG 4500 

GACAAGTCCC CTCGAAGGAA AGGAAATGAC TAGAGTAGAA TGACAGCTAG CAGATCTCTT 4560 

CCCTCCTGCT CCCAGCGCAC ACAAACCCGC CCTCCCCTTG GTGTTGGCGG TCCCTGTGGC 4620

CTTCACTTTG TTCACTACCT GTCAGCCCAG CCTGGGTGCA CAGTAGCTGC AACTCCCCAT 4680 

TGGTGCTACC TGGCTCTCCT GTCTCTGCAG CTCTACAGGT GAGGCCCAGC AGAGGGAGTA 4740 

GGGCTCGCCA TGTTTCTGGT GAGCCAATTT GGCTGATCTT GGGTGTCTGA ACAGCTATTG 4800

GGTCCACCCC AGTCCCTTTC AGCTGCTGCT TAATGCCCTG CTCTCTCCCT GGCCCACCTT 4860 

ATAGAGAGCC CAAAGAGCTC CTGTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATCTTGC 4920 

ACGAGGCACC AGAGTCTCCC TGGGTCTTGT GATGAACTAC ATTTATCCCC TTTCCTGCCC 4980 

CAACCACAAA CTCTTTCCTT CAAAGAGGGC CTGCCTGGCT CCCTCCACCC AACTGCACCC 5040

ATGAGACTCG GTCCAAGAGT CCATTCCCCA GGTGGGAGCC AACTGTCAGG GAGGTCTTTC 5100

CCACCAAACA TCTTTCAGCT GCTGGGAGGT GACCATAGGG CTCTGCTTTT AAAGATATGG 5160 
CTGCTTCAAA GGCCAGAGTC ACAGGAAGGA C 



CTAATGCAAG GGTCTCACAC TGTGAACCAC TTAGGATGTG ATCACITTCA GGTGGCCAGG 5340 

AATGTTGAAT GTCTTTGGCT CAGTTCATTT AAAAAAGATA TCTATTTGAA AGTTCTCAGA 5400

GTTGTACATA TGTTTCACAG TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460 

ACCAAGAGCC AATATCTAGG CATTTTCTTG GTAGCACAAA TTTTCTTATT GCTTAGAAAA 5520

TTGTCCTCCT TGTTATTTCT GTTTGTAAGA CTTAAGTGAG TTAGGTCTTT AAGGAAAGCA 5580 




ACGCTCCTCT GAAATGCTTG TCTTTTTTCT GTTGCCGAAA TAGCTGGTCC TTTTTCGGGA 5640 
GTTAGATGTA TAGAGTGTTT GTATGTAAAC ATTTCTTGTA GGCATCACCA TGAACAAAGA 5700 
TATATTTTCT ATTTATTTAT TATATGTGCA CTTCAAGAAG TCACTGTCAG AGAAATAAAG 5760 
AATTGTCTTA AATGTCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 






Protein Accession #:



35 
40 



HMGQQLVGQY 
DCNAVSTFWM 
AYKNQDHGAW 



PIHFKLAGDV 
ANPNNNLINC 
LRGGDVWLDS 



DHCLGLLVKS 



NEREKPAFCS 



CINVPDWRGA 
WTLQKGYTI 
KTGVFVRTLQ 
ALIPKNAGVS 
WNDFAYIEVD 
NSIVLMASKG 
KIFQWPIPV 



1CSGCYAQMY 
HWDQTAPAEL 
MDKVEQSYPG 
DCTATAYPKF 
GKKYPSSEDG 
RYVSRGPWTR 



YPGYIPKPRQ 
PSVGMYSPGY SEHIPLGKFY 
PHQDADPLXF REPAIIRKFI 
DDGSKQEIKN SLFVGESGNV 
NIQNCTFRKF VALEGRHTSA 
NQLDMDGDKT SVFKDVDGSV 
IQAYKTSNLR MXIIKNDFPS 
AIWLIKFNXG DWIRVGLCYP 
RSHYYWDEDS GLLFLKLKAQ 
TERAWDVPM PKKLFGSQLK 
IQVWIDGNQ GRWSHTSFR 
VLEKLGADRG LXLKEQMAFV 



Seq ID NO: 452 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 261..2861



ACGTCCGGGG 
AGAGGGAGCA 
TGCTGACCAT 
CTGGGTGCCC 
ACCATGTGCA 



TGCGAACCCG 
GCCCTTTCCA 
CGGATCCTTA 
ATGGACAGAA 
CAGAAGGAGG 



AAGAGAGTGA 
TTGCAGTGAA 
AATTGGGAAG 
TGAAAGGAAA 
CTGCTGCTGC 



1 

' TCAAGCAGAG 
AGGCAGCGGG 
CCGCTTGCCC 
CCGCTGCGCT 
CACTGCCAGG 
CAGCTGGCTC 
TGACCAGAGC 
TATCGGCCAG 
CTCAGAGGGA 
GCACATCCTG 
GGGCAATTTC 
CTATGGTCTG 
AAAGCTCTCC 
CTATTTTTTT 
ATCAGGCACA 
ACGTCTGGTC 
TGATGAAGGT 



GCTGAGCGCG 



CCTGGCCCGC 
ATGGGAGCTG 
ACTCTGACCT 
CCTGAGTTGC 
GGCAAGACAC 



TGCTATCGGA 
GCCAGGGTCT 
AGCTCGCGGC 



CTGGGAGGCA 
GCTTCCCTCG 
AACCCTGGAA 
TGCTGCTCAC 
TCATTAAAGA 
GAGGAGAGCT 
TGTATGGAAG 



ACTGTCTCGG 
GGACTTCCTC 
GGCCACATCC 



ATTGACAACG 
ACCATCATTT 
AAGTACATTG 
TGGACATTTC 
GAAAGGAGCT GGGGCCACCG 
GTCATCCATT 



GGCTGA7GAA 
AGGAGGCGCT 
CCTTCACCCA 
TGGAGT7ATT 
TGACACCTAT 



CTACAGACCC 
TTCAAGGCCA 
ACAGTGGCTG 
GACCAAGACC 
ACGG7CTATT 
CCGATTGTTT 
AGTGCCCTCT 
GGTATTCAGC 
CTTGAGTTGC 



TCCATCATCT 



CTGCACCTTG GATTTAGACA C 



GTTCATGTCA 
AGATCCAAGA 
ATCCTTTCTG 
GC3ATGACCA 
TTTCTAACTG 



TATCTCAGAC 
TATGCAATCG 
AGGTTGTCTA 



TGAGTGGGTT 



CAAAAAAGGC 



CCAT1GACAC 
AACCTGGAGA 
TCCAGGTGCT 
TGTACCTGCA 
TGAGCCGGAA 
ACATCTGCAA 
TTAAGGCAGC 
AGTACCCGAT 
CCACATACAT 
GCTCCAATGG 
TCACGGAAGA 
AGTCTGGAAC 
ACTCCTACCC 
GGATGGCCAA 
GATTTTGGTT 
GTTATTCAGA 
GGGCTGGCAT 
GGCCGTTCCT 
AGCCCCGGGA 



TGGCGAATAT TTCAATGTTT 
GTGGTTCQAT CATGATAAAG 
GAAAGCTCAC CCAGGAAAAA 
ATACAGGCCA CTACAATGGA TGGAGTTAAC CTCAGCACCG 
CAGGATTATA GGTTTGCTTG CTACGACCGG GGCAGAGCCT 
TTCCTCTGTG GGAAGCCTGT GAGGCCCAAA CTCACAGTCA 
AGCACCATTC TGAACTTGGA GGATAATGTA CAGTCATGGA 
ATTGCCAGTA CTGATTACTC CATGTACCAG GCAGAAGAGT 
CCAACCAGGT CAAAGTGGCA GGGAAACCAA 



3 ATGGGGGAGA T 



A ATGCTACCCC T 



ACACTTGGAG 
TCACTTCCAC 
CAGGGACCTC 
CTTGTTGATC 



CTGGCCGGTG 
TCCATCCATC 
AAGGACGTTG 



ATGTAGACGA 



AAGGCTTCTT 
CCTCTGGAAT 
CGGGGTCGCC 
GGGAGCAATG 
CTCTGACTCC 



CCTCCTCCCC 
AGGGTACATC 
TCCCAACAAC 
TATTTTTCAC 
GCACATTCCA 
GATCATAGAC 
CTCAATCATC 
GCCGGCCATC 
CGGCGGGGAT 
GCTTACAGGA 
GGCTCAGGGA 
CTTTGCTCAC 



TCGGACCGTG 
CCCAAGCCCA 
AACCTCATCA 
CACGTACCAA 
CTGGGAAAAT 
AACGGAGTCA 
TCTGCCAGAT 
ATCAGACACT 



GGCAAGACTG 
ACTGTGCCGC 
CGGGCCCCTC 
TCTATAACAA 
AAACCACCGA 
ACAGCCCTCA 
TCATTGCCTA 



GGGACAGCAG 
AAGGGGAGGT 
TCGCTGCGTC 
CTCTTTGGGC 
CTGTCTTGGC 
GTGCAAGATG 
CAATGCTGTG 
TGCAGGATCT 
CGTGGGAATG 
CCX3AGCACAT 
GGCCTCTGCC 



TATGACCCAC 



CACTGCTTCT 
CTCCTTGTCA 
ATCACAGAGG 
TCCACCTTCT 



A CAAGAACCAG 
TTCAGAGGG 
T GCTGGGGGGA 
G CCTGCTGAAG 
G ATGGAGAAGT 
T CAGTCCCCAG 
C TGCCTTAGGG 
A CCCTAGATGT 

CCTCGGCCTG GGATTTCAGA GCTGGAAATA TAGAAAATAT CTAGCCCAAA 



TACTCCCCAG 
TCCAACTACC 
AAGGACAAGC 
GACCCGCTGA 



AAGAGGGTGA 



CTTATGAGCA 
AGTCCACAGA 
CACAGGGGGC 



3 GGGGCATTTT G 
C CCTGCCGCTG C 
3 CCCACTCATG A 



GAGGCTCAGG 



2100 
2160 

2280 



2580 

2700 

2820 
2880 

3000 
3050 
3120 



TAACAGATGG GGAAAGTGAG
TGGGGAGCCC CACCCTAGCC 
GTGCCCAGGC ACTCCTGAGG 
AAATGACTAG AGTAGAATGA 
AACCCGCCCT CCCCTTGGTG 
AGCCCAGCCT GGGTGCACAG 
TCTGCAGCTC TACAGGTGAG 
CCAATTTGGC TGATCTTGGG 
TGCTGCTTAA TGCCCTGCTC 
TAAGAGGGAG AACTCTATCT 
GTCTTGTGAT GAACTACATT 
AGAGGGCCTG 
TTCCCCAGGT 
GGGAGGTGAC CATAGGGCTC 



GGGAAAGAAC 
CACCACATTG 
AAATGGGGAC 
ATCTCTTCCC 
CTGTGGCCTT 
TAGCTGCAAC TCCCCATTGG 
TCCTGCTCCC
CACTTTGTTC 
TGCTACCTGG 



TGTCTGAACA 
TCTCCCTGGC 
GTGGTTTATA 
TATCCCCTTT 



GGGAGTAGGG 
GCTATTGGGT 
CCACCTTATA 
ATCTTGCACG AGGCACCAGA 
CCTGCCCCAA 
TGCACCCATG 
GTCTTTCCCA 
GATATGGCTG 



AG CG CACACA 
ACTACCTGTC 
CTCTCCTGTC 
TTC7GGTGAG 
CCC7TTCAGC 
AGAGCTCCTG 




CCACAAACTC 
AGACTCGGTC 
CCAAACATCT 
C"TCAAAGGC 
AGAGTTAAAA 
ATGCAAGGGT 
GTTGAATGTC 
GTACATATGT 



TTTCCTTCAA 
CAAGAGTCCA 
TTCAGCTGCT 



TGACCTCATG 
CTCACAC-GT 
TTTGGCTCAG 
TTCACAGTAC 



T GCCGAAATAG 
TGTAAACATT TCTTGTAGGC 
ATGTGCACTT CAAGAAGTCA 
GAGATGTCCT TTGCATTGCT 
r TTTGCTGTTA 



ATCACCATGA 
CTGTCAGAGA 
TGGAAGGGGT 



GAAAGCAACG CTCCTCTGAA ATGCTTGTCT 
AGATGTATAG AGTGTTTGTA 
ACAAAGATAT ATTTTCTATT TATTTATTAT 
AATAAAGAAT 7GTCTTAAAT GTCATGATTG 
GTACCTAGAG CCAAGGAAA? TGGCTCTGGT 
CATACAAAGG ATGTCAAAAA AAAAAAAAAA 



30 
35 
40 
45 
50 



MGAAGRQDFL 
GKTLLLTSSA 
TIILYGRADE 



PKAMLTISWL 
TVYSXHISEG 
GIQPDPYYGL 



AMTKLGSKHF 
FNVSLSSEWV 
LSTEWYKKG 
QSWKPGDTLV 
VGLLSRNIIV 
LVGQYPIHFH 



GKLVIKDHDE 
KYIGVGKGGA 
VIHSDRFDTY 



KLFQTEHGEY 
I QATTMDGVN 
STILNLEDNV 
EIDGVDMRAE 
GTELKHMGQQ 
KDWGYNSLG 
PKPRQDCNAV 
LGKFYNMRBH 
IRHFIAYKNQ 
FSPPCRCIAK 



Seq ID NO: 454 DNA sequence 

Nucleic Acid Accession #: NM_013282.2

Coding sequence: 85..2466



QDVEWTEWFD 
QDYRFACYDR 
IASTDYSMYQ 
MGEMEDKCYP 
LAGDVDERGG 
ERNTFDHCLG 
STFWMANPNN NLINCAAAGS 
SNYRAGMI ID NGVKTTEASA 



FLTVKGNPSS 
HDKVSQTKGG 
GRACRSYRVR 
AEEFQVLPCR 
YRNHICNFFD 
YDPPTYIRDL 
LLVK3GTLLP 
EETGFWFIFH 
KDKRPFLSII 
EAQEGFLLTG 



PEIiQPWNPGH 
IDNGGELHAG 
WTFLNKTLIIP 
QYLNAVPDGR 



EKISDLWKAH 
FLCGKPVRPK 
SCAPNQVKVA 
FDTFGGHIKF 



I 

DQDHHVHIGQ 
SALCPFQGNF 
GGMAEGGYFF 
ILSVAVNDEG 
HRGSAAARVF 
PGKICNRPID 
LTVTIDTNVN 
GKPMYLHIGE 
ALGFKAAHLE 
TVHGSNGLLI 
ITEDSYPGYI 
YSPGYSEHIF 
DPLKPREPAI 
DEAASGMAQG 



CGACTCCTTA 
GTCCCTCCCC 
ACCCACACGG 



GAGGACGGCC 
GTCCGCCAGA 
ACCGACTCCG 



TCAGCGCCGA 
TGGACTCGCT 
TCCACGTGGA 
ATACCCTCTT 



TGGCTCAGAG 
GTCCAGGCTG 



GGGCTGTACA 
GAGGCGCAGG 
ACGTCCAGGC 
GAGAACGGCG 
AAGTGGCAGG 



GCTGCTGCCT 
AGACTGACAG 
AGGTCAATGA 



ATCCAGGTTC 
ACCAAGGTGG 
CAGAGGCTGT 
GTCCGCCTGA 
ACCAAGGAGC 



GAGAACGGCG TGGTCCAGAT 



CGGGAACTCT 
TTCGTGGACG 
CCCATGAGAC 
TGCCGGGTCT 
TGCGATGAGT 



GCTTCTGGTA 
ACGCCAACGT 
AAGTCTTCAA 
GGAAGAGCGG 
GCGCCTGCCA 



CAGGCCAGCC 
GTACGTCGAT 
GACGCGGAAG 
GGAGGACGTC 
GAACTCCAGG 
GGGCCAGGTG 
CGACGCGGAG 
GGTGCTGGGG 
GATTGAGCGG 



ATTTACCACG 
GACGTCCGAG 
GTCATGCTCA 



TCACAGCGGG 



AGCAACGACG 
GGGAATTTTT 
GCGGAACAGT 
TTTGCTCCCA 



ACGAGTGGTA 
GGCTGAGAGA 
ACTGGGGCAA 
ACCACTACGG 
TCAGCGAGTC 
GAGCGTACTC 



AACCGCTACG 
TTTCTCGTGT 
GAGGGGAAGG 



ACCCATCCCG 
GGGTGTCCAT 
CCTAGTCCTG 



CTTGTGATCA GAAACTCACC 
TCAATGACCA AGAAGGGGCC 
TGCGCAATGT CAAGGGTGGC 
ATGGCATCTA CAAGGTTGTG 
GGCGCTACCT TCTGCGGAGG 
ACCGGAT CAA GAAGCTGGGG 



TACTGCCTGG 
TGCCGGAATG 
AAGGCGAAGA 
TGTGTGGGCC 



TCTACAGGGG CAAACAGATG 240 

ATGACACCAT CCAGCTCCTG 300 

GGGACTCCGA GCTCTCCGAC 360 

AGTCCTCCAC CCACGGCGAG 420 

TG1GGGATGA GACGGAATTG 480 

CGAACATGGG GGCGTGGTTT 540 

GGGACGAGCC CTGCAGCTCC 600 

TGAAATACGA CGACTACCCG 660 

CGCGCGCCCG CACCATCATC 720 

ACTACAACCC CGACAACCCC 780 

AGCGCGAGAC CAGGACGGCG 840 
TGAACGACTG 
GGAGCCCCAT 
AGGACGACGT 
ACCCCGACAA 
ACCCGCCCCT 



TGGCCTCGGC 
GCACCAAGGA 
TGGGCACCAT 



GCGGGGGGCT ATGAGGATGA CGTGGACCAT 
GGTCGAGATC TTTCCGGCAA CAAGAGGACC 
AACACCAACA GGGCGCTGGC TCTCAACTGC 
GAGGCCAAGG ACTGGCGG7C GGGGAAGCCG 
AAGAATAGCA AGTACGCCCC CGCTGAGGGC 
AAATACTGGC CCGAGAAGGG GAAGTCCGGG 
GACGATGATG AGCC7GGCCC TTGGACGAAG 
CTGACCATGC AGTATCCAGA AGGCTACCTG 
GAAGCCCTGG CCAACCGAGA GCGAGAGAAG GAGAACAGCA AGAGGGAGGA GGAGGAGCAG 1980

CAGGAGGGGG GCTTCGCGTC CCCCAGGACG GGCAAGGGCA AGTGGAAGCG GAAGTCGGCA 2040 

GGAGGTGGCC CGAGCAGGGC CGGGTCCCCG CGCCGGACAT CCAAGAAAAC CAAGGTGGAG 2100 

CCCTACAGTC TCACGGCCCA GCAGAGCAGC CTCATCAGAG AGGACAAGAG CAACGCCAAG 2160

CTGTGGAATG AGGTCCTGGC GTCACTCAAG GACCGGCCGG CGAGCGGCAG CCCGTTCCAG 2220 

TTGTTCCTGA GTAAAGTGGA GGAGACGTTC CAGTGTATCT GCTGTCAGGA GCTGGTGTTC 2280 



CAGGTGAACC AGCCTCTGCA GACCGTCCTC AACCAGCTCT TCCCCGGCTA CGGCAATGGC 2460

CGGTGATCTC CAAGCACTTC TCGACAGGCG TTTTGCTGAA AACGTGTCGG AGGGCTCGTT 2520 

CATCGGCACT GATTTTGTTC TTAGTGGGCT TAACTTAAAC AGGTAGTGTT TCCTCCGTTC 2580 

CCTAAAAAGG TTTGTCTTCC TTTTTTTTTA TTTTTATTTT TCAAATCTAT ACATTTTCAG 2640 

GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT TGTGTTTAGT TCTTTGAAAA 2700 

CATAAAAGCC TGCAATTTCT CGACAAAACA ACACAAGATT TTTTAAAGAT GGAATCAGAA 2760 

ACTACGTGGT GTGGAGGCTG TTGATGTTTC TGGTGTCAAG TTCTCAGAAG TTGCTGCCAC 2820

CAACTCTTTA AGAAGGCGAC AGGATCAGTC CTTCTCTAGG GTTCTGGCCC CCAAGGTCAG 2880 

AGCAAGCATC TTCCTGACAG CATTTTGTCA TCTAAAGTCC AGTGACATGG TTCCCCGTGG 2940



AAAGAGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGCCTC AAAGCCATCC CCCACCAGAC 
TGCTTAGCGT CTGAGATCCG CGTGAAAAGT CCTCTGCCCA CGAGAGCAGG GAGTTGGGGC 
CACGCAGAAA TGGCCTCAAG GGGACTCTGC TCCACGTGGG GCCAGGCGTG TGACTGACGC 
TGTCCGACGA AGGCGGCCAC GGACGGACGC CAGCACACGA AGTCACGTGC AAGTGCCTTT 
GATTCGTTCC TTCTTTCTAA AGACGACAGT CTTTGTTGTT AGCACTGAAT TATTGAAAAT 
GTCAACCAGA TTCTAGAAAC TGCGGTCATC CAGTTCTTCC TGACACCGGA TGGGTGCTTG 
GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT TTTAACTCAG CAAGTGAGAA 
CTTACAAGAG GGTTTTTTTT TAATTTTTTT TTCTCTTAAT GAACACATTT TCTAAATGAA 



TTGTTTTTGT ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT TTTTCTAATT TTACCAAAGT 3600 

TTGCAGCCTA TACCTCAATA AAACAGGGAT ATTTTAAATC ACATACCTGC AGACAAACTG 3660 

GAGCAATGTT ATTTTTAAAG GGTTTTTTTC ACCTCCTTAT TCTTAGATTA TTAATGTATT 3720 

AGGGAAGAAT GAGACAATTT TGTGTAGGCT TTTTCTAAAG TCCAG-ACTT TGTCCAGATT 3780 
TTAGATTCTC AGAATAAATG TTTTTCACAG ATTGAAAAAA AAAAAAAA 

Seq ID NO: 455 Protein sequence 
Protein Accession #: NP_037414.2

1 11 21 31 41 51 

| | | | | |

MWIQVRTMDG RQTHTVDSLS RLTKVEELRR KIQELFHVEP GLQRLFYRGK QMEDGHTLFD 60 

YEVRLNDTIQ LLVRQSLVLP HSTKERDSEL SDTDSGCCLG QSESDKSSTH GEAAAETDSR 120 

PADEDMWDET ELGLYKVNEY VDARDTNMGA WPEAQWRVT RKAFSRDEPC SSTSRPALEE 180 

DVIYHVKYDD YPENGWQMN SRDVRARART 1 1 KWQDLE VG QWNLNYNPD NFKERGFWYD 240 

AEISRKKETR TARELYANVV LGDDSLNDCR IIFVDEVFXI ERPGEGSPMV DKPMRRKSG? 300 

SCKHCKDDVN RLCRVCACHL CGGRQDPDKQ LMCDECDMAF HIYCLDPPLS SVPSEDEWYC 360 

PECRNDASEV VLAGERLRES KKKAKMASAT SSSQRDWGKG MACVGRTKEC TIVPSNHYG? 420 

IPGIPVGTMW RFRVQVSESG VHRPHVAGIH GRSNDGAYSL VLAGGYEDDV DHGNFFTYTG 480 

SGGRDLSGNK RTAEQSCDQK LTNTNRALAL NCFAPINDQE GASAKDWRSG XPVRWRNVX 540 

GGKNSKYAPA EGNRYDGIYK WKYWPEKGK SGFLVWRYLL RRDDDEPGPW TKEGXDRIKX 600 

LGLTMQYPEG YLEALANRER EKENSKREEE EQQEGGFASP RTGKGKWKRK SAGGGPSRAG 660 

SPRRTSKKTK VEPYSLTAQQ SSLIREDKSN AKLKNEVLAS LKDRPASGEP FQLFLSKVEE 720 
TFQCICCQEL VFRPITTVCQ HNVCKDCLDR SFRAQVFSCP ACRYDLGRSY AMQVNQPLQT 



1 11 21 31 41 51 

| | | | | |

GGGGACTTCT TGAACTTGCA GGGAGAATAA CTTGCGCACC CCACTTTGCG CCGGTGCCTT 60 

TGCCCCAGCG GAGCCTGCTT CGCCATCTCC GAGCCCCACC GCCCCTCCAC TCCTCGGCCT 120 

TGCCCGACAC TGAGACGCTG TTCCCAGCGT GAAAAGAGAG ACTGCGCGGC CGGCACCCGG 180 

GAGAAGGAGG AGGCAAAGAA AAGGAACGGA CATTCGGTCC TTGCGCCAGG TCCTTTGACC 240 

AGAGTTTTTC CATGTGGACG CTCTTTCAAT GGACGTGTCC CCGCGTGCTT CTTAGACGGA 300 

CTGCGGTCTC CTAAAGGTCG ACCATGGTGG CCGGGACCCG CTGTCTTCTA GCGTTGCTGC 360 

TTCCCCAGGT CCTCCTGGGC GGCGCGGCTG GCCTCGTTCC GGAGCTGGGC CGCAGGAAGT 420

TCGCGGCGGC GTCGTCGGGC CGCCCCTCAT CCCAGCCCTC TGACGAGGTC CTGAGCGAGT 480 

TCGAGTTGCG GCTGCTCAGC ATGTTCGGCC TGAAACAGAG ACCCACCCCC AGCAGGGACG 540 

CCGTGGTGCC CCCCTACATG CTAGACCTGT ATCGCAGGCA CTCAGGTCAG CCGGGCTCAC 600 

CCGCCCCAGA CCACCGGTTG GAGAGGGCAG CCAGCCGAGC CAACACTGTG CGCAGCTTCC 660 

ACCATGAAGA ATCTTTGGAA GAACTACCAG AAACGAGTGG GAAAACAACC CGGAGATTCT 720 

TCTTTAATTT AAGTTCTATC CCCACGGAGG AGTTTATCAC CTCAGCAGAG CTTCAGGTTT 780

TCCGAGAACA GATGCAAGAT GCTTTAGGAA ACAATAGCAG TTTCCATCAC CGAATTAATA 840 

TTTATGAAAT CATAAAACCT GCAACAGCCA ACTCGAAATT CCCCGTGACC AGACTTTTGG 900 

ACACCAGGTT GGTGAATCAG AATGCAAGCA GGTGGGAAAG TTTTGATGTC ACCCCCGCTG 960 

TGATGCGGTG GACTGCACAG GGACACGCCA ACCATGGATT CGTGGTGGAA GTGGCCCACT 1020 

TGGAGGAGAA ACAAGGTGTC TCCAAGAGAC ATGTTAGGAT AAGCAGGTCT TTGCACCAAG 1080

ATGAACACAG CTGGTCACAG ATAAGGCCAT TGCTAGTAAC TTTTGGCCAT GATGGAAAAG 1140

GGCATCCTCT CCACAAAAGA GAAAAACGTC AAGCCAAACA CAAACAGCGG AAACGCCTTA 1200 

AGTCCAGCTG TAAGAGACAC CCTTTGTACG TGGACTTCAG TGACGTGGGG TGGAATGACT 1260 

GGATTGTGGC TCCCCCGGGG TATCACGCCT TTTACTGCCA CGGAGAATGC CCTTTTCCTC 1320 

TGGCTGATCA TCTGAACTCC ACTAATCATG CCATTGTTCA GACGTTGGTC AACTCTGTTA 1380 

ACTCTAAGAT TCCTAAGGCA TGCTGTGTCC CGACAGAACT CAGTGCTATC TCGATGCTGT 1440 

ACCTTGACGA GAATGAAAAG GTTGTATTAA AGAACTATCA GGACATGGTT GTGGAGGGTT 1500 

GTGGGTGTCG CTAGTACAGC AAAATTAAAT ACATAAATAT ATATATA 

Seq ID NO: 457 Protein sequence 
MVAGTRCLLA LLLPQVLLGG AAGLVPELGR RKFAAASSGR PSSQPSDEVL SEFELRLLSM

FGLKQRPTPS RDAVVPPYML DLYRRHSGQP GSPAPDHRLE RAASRANTVR SFHHEESLEE

LPETSGKTTR RFFFNLSSIP TEEFITSAEL QVFREQMQDA LGNNSSFHHR INIYEIIKPA
TANSKFPVTR LLDT 

Seq ID NO: 458 DNA sequence 
Nucleic Acid Accession #: NM_001999.2
Coding sequence: 1..8736



I I 
ATGGGGAGAA GACGGAGGCT 
CTCTGGGCGC AGGGCACGGC 
CAGCCGCCGC CGCAACAGGT 
CCCGAGTATC GCGAGGAGGG 
GACGTGCTCC GAGGGCCCAA 
TGGAAGACGC TCCCTGGAGG 



CGTGTGCGGC 
AAACCAGTGC 
TAACATGTGT 



CAGCCTCCTC 
ACAGCAGGCT 
GCCAGCCGCG 
TCCAGATTCC 
ATTGTCCCGA 



CGCCCAAGC 
CTGAAGGCGG GTTTCTAGCG 
TCCGCCG3CG AGGACAGCAG 
ACTCCTACTG CTGCCCTGGA 
TTTGTAGAAA TAGTTGTGGA 




AGATGCCCTG 
TGCAGCATCA 
TTTTGTGTTT 
AGAACAGGCA 
AGAATGACGA 
CCTGAAGCCT 
CCAATGGGAG 
TTTGCCCCAA 
CCTGGAGGCA 
GGACCTATCA 



GCTAACCTTT 



AAATGCAGTG 
GTCCTGTCAG 
GAATTCCAGG 
GTGGCAATGG 
ATGGCTTTTC 
TCACTGGACT 



CTGCTGTGAG 
AGGTTCTGAG 
GAGTGCTGGT 
CAATGGCTAT 
TCCTGGCGTT 
AACAATTCTG 



AATGGGGTTC 
TGCAATGCCG 
ACAACTACCA 
ATCTGCAAAC 
TGCCAGACCC 
TGTGACTGTC 
ATGCGCAGTA 



ATAAGCAGGA 
CTAATGGAGA 
AGAGGACTCC 
TTTGTAAAAA 
GCTTTGAATT 
ACATGTGTTT 
CAGGATTTGT 
CAGGAATCTG 



3 ACGCTGTATA C 



GCTCTCGATG 
GTGCACAAGA 
GCTGGGGCAT 
GACTTTGCAT 
GAGGCACTGG 
GGACAGGCTT 
GGGGGAGCCG GTGTGGGGGC 
TAGATATCTG 
CAAGCTACCG 
ATGTTGATGA 



GGATGGACTT 
GGGAAATGGC 
CATCCCCATC 
CGGGGGACAG 
TAAGCATCAT 
ATGTGAATGC 
ATGCACATCA 
TAAATGTCAT 



CGGTCGATGC G' 



G ATGGAAGTT7 CCAGTGCATT 



3 TGCATCAATG 
A AATGGGCGTT 
3 CACTGCATCA 
Z ATGGATGGAC 



AAGATGGCAG 
ACTGTACTGA 
ACAGTGAAGG 
GTGTGTGTGT 



T ATGGTTTTGG 



CAGCCATGCC 
ATCACTGTGG 
GGGATTTGTG 
GATGCCTCTG 



AAACTCAGCT CCACAGGATT 
ATCCAGGACA GCCGCTGTGA 
GCCACCCTCG GAGCCGCCTG 



GAATGGAATG 
CTTGGCTCCA 
CATGAATGGG 
GGCTGTGGGC 
AGGAATCAAG AAAGGAGTGT GTGTGCGICC 
CTGCTGTGCC 
TTCAGCTGAA 

AT CCTGATAT 
GCAACAGTGG 
TAGTAAACAG 
CACGC CAGGA AGTTACAGCT GTACGTGCCC 
GACCTGTGAA GATATAAATG AATGTGAAAG 
CAACCTTGGA TCTTTCAATT 



CTTCAAGTGC 
TGTTGATGAA 
GTCCTTCCGC 
TGATACTCAC 



3 TGGTAGTTAC CGTTGTAATT 



AGAACCCTGC 
TGGAGTAGGT 
ATGTGCCAAT 
CTATGAACCA 
ACTGCTTTGT 
ACCAGGGTAT 
CAACCCATGT 
GCCCGGCAGC 



TTCCCTGGCG T 



T GTCAACAGTA 



GGGATGTGCA 
AGTGGCTTTG 
TCTCCTGACC 
TGCTTCGAAG 
TGTGAACGTA 
CAGTGTGACT 
AATGAATGCT 
AC CTATCAGT 
GATATTGATG 
GGAAGCTACG 
GCAGACATTG 
ATTCCTGGAG 
ACATGCATTG 
GAGAACACAA 
ACCACAGGAT 



GCTGCTGTGC 
CCAAGGAATA 
TTACTGGGCG 
CTTATGGGAA 
CTCTAGACAT 



GCTATGAAAG 
ACCCTCTCCT 
GCCCACTGGG 
CCCTGAGTGA 



TGTCGGGGCG 
CGAGACACTG 
GCCATTTIAC 
GTGCAGAAAT 
GGAGGAAAGA 
TGGAATCTGC 
TGGCTTCATG 
TTGTAGGGGT 
ACACGAGCTG 
CAATCTCTGC 



ACAATCGGAA 
AACTGCACGG 
GTCAATACAC 
ATGATGAAGA 
GGCACCTGTG 
TCACCATCCC 



AGGGATCTTT TCATTGCGAG 
GTTTGGATAT TCGCATGGAG 
CCGTTCCTGG AAAGTTCCGC 
CCGAGTGTGA GGAGTGCCCC 
GGGCTGGCTT TGCTAACCGA 
ATGAATGCAA AGCATTTCCT 



TGAACACTGA 
GTGAGGACTG 
AATGTGTGAA 



AATGTATGAT 
AATGCAGCTG 
ATGAATGTGA 



ATGT CAATGA 
AGGGATCCTT 
GTACAGATGT 



AATGAACGGA 
CAGTGAGGGT 
AAACAATCCT 
CCTCTGCTAT 



AACGGCATCA 
GGTGATGGCT 



AGTGTATTGA 



TTACCTGCTC 



GGATGAGTGT 
AGGAAGCTTC 
TCTGGACGAA 
CCCGGGCTCA 
AGATGTTGAT 



GGCTGTGACA 
TATGCCCTGA 
GATATCTGTG 
GATGGCTTCA 
AATTCAAATA 
TGTCAGCTGG 
GAAATTGGTG 
AAGTGTAGCT 
TGTTCTAATG 
TACCGCTGTG 
GAGTGTGCAG 



CCCAGTGCAC 
TGCCAGATGG 
ATGGCGGCCA 
TGGCTTCCAT 
TCTGCATGT? 
GTIACTCAGT 
CTCATAACTG 



GGGCAGCTTT 
TGTGGATATT 
CATGATTGGA 
GGGCTGTACA 
AAATTCAGAG 
GAGATCGTGT 
GTGTACCAAC 
GGACATGAAA 



GAACCCACCA GTGTAGCATC 
CCTGCTCCGA AGGTTTCACT 
AAAACATAAA CCTCTGTGAG 



3420 
3480 
3540 
3600

3720 
3780 

3900 
3960 
4020 
4080 
4140 
4200 
AACGGACAGT GCCTTAATGT CCCGGGTGCA T.
CCAGCCTCAG ACAGCAGATC CTGCCAS 

GTCTCTGGAA CATGTAATAA CCTGCCTGGA ATGTTTCATT GCATCTGCGA TGATGGTTAT 4560 

GAATTGGACA GAACAGGAGG GAACTGTACA GATATTGATG AGTGTGCAGA TCCTATAAAC 4620

TGTGTCAATG GCCTATGTGT CAACACGCCT GGTGGCTATG AGTGTAACTG CCCACCCGAT 4680 

TTTCAGTTGA ACCCAACTGG TGTGGGTTGT GTTGACAACC GTGTGGGCAA CTGCTACCTG 4740 

AAGTTTGGAC CTCGAGGAGA TGGGAGTCTG TCTTGCAACA CCGAGATCGG GGTGGGCGTC 4800

AGTCGCTCTT CATGCTGCTG CTCTCTGGGA AAGGCCTGGG GAAACCCCTG TGAGACATGC 4860 

CCCCCTGTCA ATAGCACTGA ATATTACACC CTGTGTCCCG GAGGTGAAGG CTTCAGACCT 4920 

AACCCCATCA CAATCATTTT AGAAGACATT GACGAATGCC AGGAGTTACC AGGTCTCTGC 4980 

CAGGGTGGAA ACTGCATCAA CACTTTTGGG AGCTTCCAGT GTGAGTGCCC ACAAGGCTAC 5040 

TACCTCAGCG AGGATACCCG CATCTGTGAG GATATTGATG AGTGTTTTGC ACATCCTGGT 5100 

GTGTGTGGGC CTGGGACCTG CTATAACACC CTGGGAAATT ACACCTGCAT TTGCCCACCT 5160 

GAGTACATGC AGGTCAATGG AGGCCACAAC TGCATGGACA TGAGAAAAAG CTTTTGCTAC 5220 

CGAAGCTATA ATGGAACCAC TTGTGAGAAT GAGTTGCCTT TCAATGTGAC AAAAAGGATG 5280 

TGCTGCTGCA CATATAATGT GGGCAAAGCT GGGAACAAAC CTTGTGAACC ATGCCCAACT 5340

CCAGGAACAG CTGACTTTAA AACCATATGT GGAAATATTC CTGGATTCAC CTTTGACATT 5400 

CACACAGGAA AAGCTGTTGA CATTGATGAA TGTAAAGAGA TTCCAGGCAT TTGTGCAAAT 5460 

GGTGTGTGCA TTAACCAGAT TGGCAGTTTC CGCTGTGAAT GCCCTACAGG ATTCAGTTAC 5520 

AATGACCTGC TGTTGGTTTG TGAAGATATA GATGAGTGCA GCAATGGTGA TAATCTCTGC 5580

CAGCGGAATG CAGACTGCAT CAATAGTCCT GGTAGTTACC GCTGTGAATG TGCCGCGGGT 5640 

TTCAAACTTT CACCCAATGG GGCCTGTGTA GATCGCAATG AATGTTTAGA AATTCCTAAC 5700 

GTTTGCAGTC ATGGCTTGTG TGTTGATCTG CAAGGAAGTT ACCAGTGCAT CTGCCACAAT 5760 

GGCTTTAAGG CTTCTCAGGA CCAGACCATG TGCATGGATG TTGATGAGTG CGAGCGGCAC 5820

CCATGTGGAA ATGGAACTTG TAAAAACACC GTTGGATCCT ATAACTGTCT GTGCTACCCA 5880

GGGTTTGAAC TCACTCATAA TAATGATTGC CTGGACATAG ATGAGTGCAG TTCCTTTTTT 5940 

GGTCAGGTGT GCAGAAATGG ACGTTGTTTT AATGAAATTG GTTCTTTCAA GTGTCTATGT 6000 

AACGAAGGTT ATGAACTTAC CCCAGATGGC AAAAACTGTA TAGACACTAA TGAGTGTGTC 6060 

GCCCTTCCCG GCTCTTGCTC TCCTGGTACC TGTCAGAATT TGGAGGGATC CTTCAGATGC 6120 

ATCTGTCCCC CAGGGTATGA AGTAAAAAGC GAGAACTGCA TTGATATAAA TGAATGTGAT 6180 

GAAGATCCCA ACATTTGTCT TTTTGGTTCC TGTACTAATA CTCCAGGGGG CTTCCAGTGC 6240 

CTCTGCCCCC CTGGCTTTGT ACTATCTGAT AATGGACGGA GATGCTTTGA TACTCGCCAG 6300

AGCTTCTGCT TCACAAATTT TGAAAATGGA AAGTGTTCTG TACGCAAAGC TTTCAACACC 6360 

ACAAAAGCAA AATGCTGCTG TAGTAAGATG CCAGGAGAGG GCTGGGGGGA CCCCTGTGAG 6420 

CTGTGCCCCA AAGACGATGA AGTTGCATTT CAGGATTTGT GTCCATATGG CCATGGAACT 6480

GTCCCTAGTC TTCATGATAC ACGTGAAGAT GTCAATGAGT GTCTTGAGAG CCCAGGCATT 6540 

TGTTCAAATG GTCAATGTAT CAACACCGAC GGATCTTTTC GCTGTGAATG TCCAATGGGC 6600 

TACAACCTTG ACTACACTGG AGTACGCTGT GTGGATACTG ATGAGTGTTC AATCGGCAAT 6660

CCGTGTGGAA ATGGTACATG CACCAATGTT ATTGGGAGTT TTGAATGCAA TTGCAATGAA 6720 

GGCTTTGAGC CAGGGCCCAT GATGAATTGT GAAGATATCA ACGAATGTGC CCAGAACCCA 6780 

CTGCTGTGTG CTTTACGCTG CATGAACACT TTTGGGTCCT ATGAATGCAC GTGCCCGATT 6840 

C TCAGGGAAGA TCAAAAGATG TGCAAAGATC TGGATGAATG TGCTGAAGGG 6900

IATCTAG GGGCATGATG TGTAAGAATC TAATCGGCAC CTTCATGTGC 6960

C CTGGAATGGC CCGAAGGCCC GATGGAGAAG GCTGTGTAGA TGAAAATGAA 7020

TGCAGGACCA AGCCAGGAAT CTGTGAAAAT GGACGTTGTG TTAACATTAT TGGAAGCTAT 7080 

AGATGTGAGT GTAATGAAGG ATTCCAGTCA AGTTCTTCAG GCACTGAATG CCTTGACAAT 7140 

CGACAGGGTC TCTGCTTTGC AGAGGTACTG CAGACAATAT GTCAAATGGC ATCCAGTAGT 7200 

CGCAATCTCG TCACTAAGTC AGAATGCTGC TGTGATGGTG GGCGAGGCTG GGGCCACCAG 7260 

TGCGAGCTTT GCCCACTTCC TGGAACTGCC CAGTACAAAA AGATATGTCC TCATGGCCCA 7320

GGATATACAA CTGATGGAAG AGATATTGAT GAATGTAAGG TAATGCCAAA CCTCTGCACC 7380 

AATGGTCAGT GCATCAATAC CATGGGCTCA TTCCGATGCT TCTGCAAGGT TGGCTACACC 7440 

ACAGACATCA GTGGAACCTC TTGTATAGAC CTTGATGAAT GCTCCCAGTC CCCGAAACCA 7500

TGCAACTACA TCTGCAAGAA CACTGAGGGG AGTTATCAGT GTTCATGTCC GAGGGGGTAT 7560 

GTCCTGCAAG AGGATGGAAA GACATGCAAA GACCTTGATG AATGTCAAAC AAAGCAGCAT 7620 

AACTGCCAGT TCCTCTGTGT CAACACCCTG GGGGGGTTTA CCTGTAAATG TCCACCTGGT 7680 

TTCACACAGC ATCACACTGC TTGTATCGAC AACAACGAAT GTGGGTCTCA ACCTTTGCTT 7740 

TGTGGAGGAA AGGGAATCTG TCAAAACACT CCAGGCAGTT TCAGCTGTGA ATGCCAAAGA 7800 

GGGTTCTCTC TTGATGCCAC CGGACTGAAC TGTGAAGATG TTGATGAATG TGATGGGAAC 7860

CACAGGTGCC AACACGGCTG CCAGAACATC CTGGGTGGCT ACAGATGTGG CTGCCCCCAA 7920 

GGCTACATCC AGCACTACCA GTGGAATCAG TGTGTCGATG AGAATGAATG CTCCAATCCC 7980

AATGCCTGTG GCTCTGCTTC CTGCTACAAC ACCCTGGGGA GTTACAAGTG CGCCTGCCCC 8040 

TCGGGGTTCT CCTTCGACCA GTTCTCCAGT GCCTGCCACG ACGTGAATGA GTGCTCGTCC 8100 

TCCAAGAACC CCTGCAATTA CGGCTGCTCT AACACGGAGG GGGGCTACCT CTGTGGCTGC 8160 

CCCCCTGGGT ATTACAGAGT GGGACAAGGC CACTGTGTCT CAGGAATGGG ATTTAACAAG 8220 

GGGCAGTACC TGTCACTGGA TACAGAGGTC GATGAGGAAA ATGCTCTGTC CCCAGAAGCA 8280 

TGCTACGAGT GCAAAATCAA CGGCTATCCT AAGAAAGACA GCAGGCAGAA GAGAAGTATT 8340 

CATGAACCTG ATCCCACTGC TGTTGAACAG ATCAGCCTAG AGAGTGTCGA CATGGACAGC 8400 

CCCGTCAACA TGAAGTTCAA CCTCTCCCAC CTCGGCTCTA AGGAGCACAT CCTGGAACTA 8460 

AGGCCCGCCA TCCAGCCCCT CAACAACCAC ATCCGTTATG TCATCTCTCA AGGGAACGAT 8520 

GACAGCGTCT TCCGCATCCA CCAAAGGAAT GGGCTCAGCT ACTTGCACAC GGCCAAGAAG 8580

AAGCTCATGC CCGGCACATA CACACTGGAA ATCACTAGCA TCCCTCTCTA CAAGAAGAAG 8640 

GAGCTTAAGA AACTGGAAGA GAGCAATGAG GATGACTACC TCCTAGGGGA GCTTGGGGAG 8700 

GCTCTCAGAA TGAGGCTGCA GATTCAGCTC TATTAACCGT TCACAGACTT GGGCCCAGGC 8760 

TCAAATCCTA GCACAGCCAG TCTGCAGAAG CATTTGAAAA GTCAAGGACT AATTTTAAAG 8820 

AGGAAAAATA ATAATAACTC TTGTTTCTTT CCTCCCTGTC TTAGACTTTG AATGTTGACC 8880 

CTCACAGGGA GGGATAATTT AGACTCTGGT ATGGCCAAAG ATTTGAGCTC AAAGGCAACC 8940 
GTGGTTACTG TATTTTTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA A 
TCAAGATATC AGCATATGGC ACTAAATGCA CAAAAATAAT G 
CCTGTTAGCA GTCTGTAACA CTTTGGGTAT TTTGCTATAG T 

GATGTTTATT TATTTTTAAT GCAGTAATAT ATGGAGAAAT GAACAAACTA TGTAAACAAA 9180

AAGGGAAACT CACTTGTTTT TCTTTAGATT TATAAATTTG AGCTATTTTT TTTAGAGGTG 9240 

CTTTTTAAAA ATCCAATAGA TACAAGAGAT GTTTCCTTTG GTTTTCTGCC AGTCATCCAG 9300 

CTGATACACA CCTGATCGAT TTTAAAGAAA GCCACACAGA GCTGAATCGG GCAGTGCTAA 9360 

TCAATAATTT AAAAGACATG AATGTCATTA GATCCTTTAT AACGTAGATC GAAGCCAAAG 9420

CAGCTCATTT GTGACAACAT TTCATATCAC CAGACACACC AGGCAACAGA AGTTGAAGCA 9480 

CAACCACTGT AGCAAAATAC CTTGACTGCT TGTGAGACCA TTAGCATTGC AGGCCAAACC 9540 

GTACTGTATT TCCTTCTCAT AACCTCAAGG AACCATATGT GCTACCCACA ACACCTCATT 9600 
CTTACCCAGG GTGCGCTGCG TCCTCATGGT ACTGTAGGCA GCTGAAGAAC CGCCGTTCCC 9660
TTGAAAGGGA ACACCTGGCA TTCTGTGGTG TTTCGTGCTG TCTTAAATAA TGGTGCATTT 9720 
ATTAIGTTCA AGTTATTTCA GGATTGCCAT ATGTGCAAAC AA^TCATGCA ATGCA3CCAA 9780 
GGAATATATG TTGTTGTTGT TGTTTTAAAC CCATTTTTTT TTTAGAATTT TCATTAATAC 9840 
TGTAGTTATA CACCATATGC CTCATTTTAT CATAGCCTAT TGTGTATGAA AGATGTTTGT 9900 
ACAATGAATT GATGTTTAGT TTGCTTTAGT CATTTAAAAA GATATTGTAC CAGGATGTGC 9960
TATTAAGAGC ACGTATCCAT TATTCTTCTC AACCCAAGAA CCTGT7TCCT GGACCAGTGA 10020 
A TATGTGAAAT GGCCAAAGCA CATGCAGGCT CCTGG7TGTT CCTCTCAAAC 10080 
LGATTAG TAACCAGTTA TACCCAGTAT TTTGAGGTTT TATTGTTTTT 10140 
TTAATAACTA AAAAAAAACT CGTGCC 






MGRRRRLCLQ 
DGFCSRPNMC 



I 



TTGHAWGHPC 
RCPAGHKQSE 
RTGMCFSGLV 



LWAQGTAGQP 
ASRVRRRGQQ DVLRGPNVCG 
TCSSGQISST CGSKSIQQCS 
IAQPCACVYG FTGPQCERDY 
EMCPAQPQPC RRGFIPNIRT 
TTQKCEDIDE CSIIPGICET 
NGRCAQELPG RMTKMQCCCE 



QPPPPKPPRP 



VRCMMGGTCA 
RTGPCFTQVN 
GACQDVDECQ 
GECSNTVGSY 



QPPPQQVRSA 
WKTLPGGNQC 
DDHCQCQKGY 
NQMCQGQLTG 
AIPGICQGGN 



TAGS EGG FLA 
IVPICRNSCG 
IGTYCGQPVC 



NPCTNGDCVN 



NQTIDICKHH 
GKNCVDKDEC 



AGFQRTPTKQ A 



VFRTETETCE 
IQDSRCEVNI 
FPGVCPNGRC 
MDACCCAVGA 
GMCTYGKCRN 



RCNCNSGYEP 



NGATLKSECC 
VNSKGSFHCE 
AWGTECEECP 
TIGSFKCRCN 
MMKNCMDIDG 



CDCPPGLAVG 
QPCPAKNSAE 
DASGRNCIDI 
VNGACRNNLG 
ATLGAAWGSP 
CPEGLTLDGT 
KPGTKEYETL 
SGFALDMEER 
CERNPLLCRG 



FHGLCSSGVG 
DECLVNRLLC 
SFNCECSPGS 
CERCELDTAC 
GRVCLDIRME 



PEACPVRGSE 
PGGNGFSPGV 
NMGYKQDANG 
NGVLCKNGRC 
ICKFGFVLAP 
MRSTCYGGIK 
ITVDGRDINE 
DNGLCRNTPG 



GGAGVGAGGQ 480 



3 CIHPVPGKFR 



NCTDIDECRI S 



GSYECSCSEG 
TCIDVWECDL 
ASCLNIPGSF 



YALMPDGRSC 
NSNICMFGEC 
KCSCREGW1G 
ECAENINLCE 
MFHCICDDGY 
VDNRVGNCYL 



EHTKGSFICH 
NGIKCIDLDE 
NGQCLNVPGA 



HTGKAVDIDE 
QRNADCINSP 
GFKASQDQTM 
GQVCRNGRCF 
ICPPGYEVKS 
SFCFTNFENG 



GRCVWIIGSY 



YLSEDTRICE DIDECFAHPG 
ELPFNVTKRM 
CKEIPGICAN 
GSYRCECAAG 
CMDVDECERH 
NEIGSFKCLC 
ENCIDINECD 
KCSVPKAFNT 
VNECLESPGI 
PCGNGTCTNV IGSFECNCNE 
GYALREDQKM 
CRTKPGICEN 
RNLVTKSECC 
NGQCINTMGS 
VLQEDGKTCK 
CGGKGICQNT 
GYIQHYQWNQ 
SKNPCNYGCS 
CYECKINGYP 
RPAIQPLNNH 
ELKKLEESNE 



Seq ID NO : 46 
Coding sequen 



FRCFCKVGYT 
DLDECQTKQH 
PGSFSCECQR 



NPITIILEDI 
VCGPGTCYNT 
CCCTYNVGKA 
GVCINQIGSF 
FKLSPNGACV 
PCGNGTCKNT 
NEGYELTPDG 
EDPNICLFGS 
TKAKCCCSKM 
CSNGQCINTD 
GFEPGPMMNC 
LHDCESRGMM 
RCECNEGFQS 
CELCPLPGTA 
TDISGTSCID 
NCQFLCVNTL 



QATPDRQGCT 
DICDGGQCTN 
CQLGYSVKKG 
CSNGTHQCSI 
YRCECEMGFT 
DIDECADPIN 
SCNTEIGVGV 
DECQELPGLC 
LGNYICICPP 
3NKPCEPCPT 
RCECPTGFSY 
DRNECLEIPN 
VGSYNCLCYP 
KNCIDTNECV 
CTNTPGGFQC 



BIGAHNCDMH 
YRCACSEGFT 
IDECSFQNIC 



DIBECMIMNG GCDTQCTNSE 
I PGE YRCLCY 
TTGCTDVDEC 
NAQCWPGS 
PASDSR3CQD 
CVNGLCVNTP 
SRSSCCCSLG KAWGNPCETC 
3FQCECPQGY 
CMDMRKSFCY 
GNIPGFTFDI 
DECSNGDNLC 
QGSYQCICHN 



EYMQVNCCHN 
PGTADFKTIC 
NDLLLVCEDI 
VCSHGLCVDL 



ALPGSCSPGT 



CKNLIGTFMC 



QYKKICPHGP 
LDECSQSPKP 
GGFTCKCPPG 



NTEGGYLCGC 
KKDSRQKRSI 
IRYVISQGND 



LCPKDDEVAF 
YNLDYTGVRC 
LLCALRCKNT 
ICPPGMARRP 
RQGLCFAEVL 
GYTTDGRDID 
CNYICKNTEG 
FTQHHTACID 
HRCQHGCQNI 
SGFSFDQFSS 



CQNLSGSFRC 
NGRRCFDTRQ 
QDLCPYGHGT 



NACGSASCYN 
PPGYYRVGQG 
HEPDPTAVEQ 
DSVFRIHQRN GLSYLHTAKK KLMPGTYTLE 
ALRMRLQIQL Y 



SYQCSCPRGY 
NNECGSQPLL 
LGGYRCGCPQ 



DEENALSPEA 
LGSKEHILEL 
ITSIPLYKKK 



GCGGCCGCAC TCAGCGCCAC GCGTCGAAAG C 
GTATGAGCCG CACAGCCTAC ACGGTGGGAG CCCTGCTTCT CCTCTTGGGG ACCCTGCTGC 



C GAGGACCCGC CGCACTGACA 



AGCACAATGA 
GGGGCCAAGG 
CCCTGCATGT 
AGCAGACCAT 
GCCAGTGCAA 
CCTGCTCCTT 
AACTACAGCC 



CTCAGAG CAG ACTCAGTCGC CCCAGCAGCC 



CCACGAGGAA 
CTCTTTCTAC 
CTGCAAGCCC 



AGGAAGTCCC 



GGATTAAGCC 
AGACCTAAAA 
TGCCTCCTGG 



3 TAAACATATC 



AAATACCTGA AGCGAGACTG 
GGCTGCAACA GTCGCACCAT 
ATCCCCAGGC ACATCCGGAA 
AAGAAATTCA CTACCATGAT 
AAGAAGAGAG TCACACGTGT 
AAATCCAGGT GCACCCAGCA 
CAACCAGATT CTTACTTGGC 
CAGGAGCCTG CTTGTGCGTA 
TTAGACACCA GAGAAAACAC 
TGCTTTAATG GGGATGTACC 



TGGCTCCAGG AACCGGGGGC 
GCTGGAGTCC AGCCAAGAGG 
GTGCAAAACC CAGCCGCTTA 
CATCAACCGC TTCTGTTACG 
GGAGGAAGGT TCCTTTCAGT 
GGTCACACTC AACTGCCCTG 



TGTCCTAGGA ATGCAGCCCC 
TTAAACCTAG AGGCCAGAAG 
GTTCGTGTGC ATGA3TGTGG 
AGTC7CTGCT AGAGAGCACT 
AGAAACCCAC CTCACCCCGG 
CTCACATCTA AAGGGGCGGG GCCGTGGTCT GGTTCTGACT TTGTGTTTTT GTGCCCTCCT 960

GGGGACCAGA ATCTCCTTTC GGAATGAATG TTCATGGAAG AGGCTCCTCT GAGGGCAAGA 1020 

GACCTGTTTT AGTGCTGCAT TCGACATGGA AAAGTCCTTT TAACCTGTGC TTGCATCCTC 1080

CTTTCCTCCT CCTCCTCACA ATCCATCTCT TCTTAAGTTG ATAGTGACTA TGTCAGTCTA 1140 

ATCTCTTGTT TGCCAAGGTT CCTAAATTAA TTCACTTAAC CATGATGCAA ATGTTTTTCA 1200 

TTTTGTGAAG ACCCTCCAGA CTCTGGGAGA GGCTGGTGTG GGCAAGGACA AGCAGGATAG 1260

TGGAGTGAGA AAGGGAGGGT GGAGGGTGAG GCCAAATCAG GTCCAGCAAA AGTCAGTAGG 1320 

GACATTGCAG AAGCTTGAAA GGCCAATACC AGAACACAGG CTGATGCTTC TGAGAAAGTC 1380 

TTTTCCTAGT ATTTAACAGA ACCCAAGTGA ACAGAGGAGA AATGAGATTG CCAGAAAGTG 1440 

ATTAACTTTG GCCGTTGCAA TCTGCTCAAA CCTAACACCA AACTGAAAAC ATAAATACTG 1500 

ACCACTCCTA TGTTCGGACC CAAGCAAGTT AGCTAAACCA AACCAACTCC TCTGCTTTGT 1560

CCCTCAGGTG GAAAAGAGAG GTAGTTTAGA ACTCTCTGCA TAGGGGTGGG AATTAATCAA 1620 

AAACCKCAGA GGCTGAAATT CCTAATACCT TTCCTTTATC GTGGTTATAG TCAGCTCATT 1680

TCCATTCCAC TATTTCCCAT AATGCTTCTG AGAGCCAC7A ACTTGATTGA TAAAGATCCT 1740 

GCCTCTGCTG AGTGTACCTG ACAGTAAGTC TAAAGATGAR AGAGTTTAGG GACTACTCTG 1800 

TTTTAGCAAG ARATATTKTG GGGGTCTTTT TGTTTTAACT ATTGTCAGGA GATTGGGCTA 1860

RAGAGAAGAC GACGAGAGTA AGGAAATAAA GGGHATTGCC TCTGGCTAGA GAGTAAGTTA 1920 

GGTGTTAATA CCTGGTAGAA ATGTAAGGGA TATGACCTCC CTTTCTTTAT GTGCTCACTG 1980 

AGGATCTGAG GGGACCCTGT TAGGAGAGCA TAGCATCA7G ATGTATTA3C TGTTCATCTG 2040 

CTACTGGTTG GATGGACATA ACTATTGTAA CTATTCAGTA TTTACTGGTA GGCACTGTCC 2100 

TCTGATTAAA CTTGGCCTAC TGGCAATGGC TACTTAGGAT TGATCTAAGG GCCAAAGTGC 2160

AGGGTGGGTG AACTTTATTG TACTTTGGAT TTGGTTAACC TGTTTTCTTC AAGCCTGAGG 2220 

TTTTATATAC AAACTCCCTG AATACTCTTT TTGCCTTGTA TCTTCTCAGC CTCCTAGCCA 2280 

AGTCCTATGT AATATGGAAA ACAAACACTG CAGACTTGAG ATTCAGTTGC CGATCAAGGC 2340 

TCTGGCATTC AGAGAACCCT TGCAACTCGA GAAGCTGTTT TTATTTCGTT TTTGTTTTGA 2400 

TCCAGTGCTC TCCCATCTAA CAACTAAACA GGAGCCATTT CAAGGCGGGA GATATTTTAA 2460

ACACCCAAAA TGTTGGGTCT GATTTTCAAA CTTTTAAACT CACTACTGAT GATTCTCACG 2520 

CTAGGCGAAT TTGTCCAAAC ACATAGTGTG TGTGTTTTGT ATACACTGTA TGACCCCACC 2580

CCAAATCTTT GTATTGTCCA CATTCTCCAA CAATAAAGCA CAGAGTGGAT TTAATTAAGC 2640 

ACACAAATGC TAAGGCAGAA TTTTGAGGGT GGGAGAGAAG AAAAGGGAAA GAAGCTGAAA 2700

ATGTAAAACC ACACCAGGGA GGAAAAATGA CATTCAGAAC CAGCAAACAC TGAATTTCTC 2760

TTGTTGTTTT AACTCTGCCA CAAGAATGCA ATTTCGTTAA TGGAGATGAC TTAAGTTGGC 2820

AGCAGTAATC TTCTTTTAGG AGCTTGTACC ACAGTCTTGC ACATAAGTGC AGATTTGGCT 2880

CAAGTAAAGA GAATTTCCTC AACACTAACT TCACTGGGAT AATCAGCAGC GTAACTACCC 2940

TAAAAGCATA TCACTAGCCA AAGAGGGAAA TATCTGTTCT TCTTACTGTG CCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGAA AATGCCATAT CTATACCATA 3060

TTTTATTCGA GTCACTGATG ATGTAATGAT ATATTTTTTC ATTATTATAG TAGAATATTT 3120 

TTATGGCAAG ATATTTGTGG TCTTGATCAT ACCTATTAAA ATAATGCCAA ACACCAAATA 3180 

TGAATTTTAT GATGTACACT TTGTGCTTGG CATTAAAAGA AAAAAACACA CATCCTGGAA 3240 

GTCTGTAAGT TGTTTTTTGT TACTGTAGGT CTTCAAAGTT AAGAGTGTAA GTGAAAAATC 3300 

TGGAGGAGAG GATAATTTCC ACTGTGTGGA ATGTGAATAG TTAAATGAAA AGTTATGGTT 3360 

ATTTAATGTA ATTATTACTT CAAATCCTTT GGTCACTGTG ATTTCAAGCA TGTTTTCTTT 3420 

TTCTCCTTTA TATGACTTTC TCTGAGTTGG GCAAAGAAGA AGCTGACACA CCGTATGTTG 3480 

TTAGAGTCTT TTATCTGGTC AGGGGAAACA AAATCTTGAC CCAGCTGAAC ATGTCTTCCT 3540

GAGTCAGTGC CTGAATCTTT ATTTTTTAAA TTGAATGTTC CTTAAAGGTT AACATTTCTA 3600

AAGCAATATT AAGAAAGACT TTAAATGTTA TTTTGGAAGA CTTACGATGC ATGTATACAA 3660

ACGAATAGCA GATAATGATG ACTAGTTCAC ACATAAAGTC CTTTTAAGGA GAAAATCTAA 3720

AATGAAAAGT GGATAAACAG AACATTTATA AGTGATCAGT TAATGCCTAA GAGTGAAAGT 3780

AGTTCTATTG ACATTCCTCA AGATATTTAA TATCAACTGC ATTATGTATT ATGTCTGCTT 3840

AAATCATTTA AAAACGGCAA AGAATTATAT AGACTATGAG GTACCTTGCT GTGTAGGAGG 3900

ATGAAAGGGG AGTTGATAGT CTCATAAAAC TAATTTGGCT TCAAGTTTCA TGAATCTGTA 3960

ACTAGAATTT AATTTTCACC CCAATAATGT TCTATATAGC CTTTGCTAAA GAGCAACTAA 4020 
TAAATTAAAC C 

Seq ID 



QCNSFYIPRH IRKEEGSFQS CSFCKPKKFT TMMVTLNCPE LQPPTKKKRV TRVKQCRCIS 180 



I I I I I I 

ATGAAAGTTG GAGTGCTGTG GCTCATTTCT TTCTTCACCT TCACTGACGG CCACGGTGGC 60 

\ AAAATGATGG CATCAAAACA AAAAAAGAAC TCATTGTGAA TAAGAAAAAA 120 

; CAGTCGAAGA AT AT CAGCTG CTGCTTCAGG TGACCTATAG AGATTCCAAG 180 

3 ATTTGAGAAA TTTTCTGAAG CTCTTGAAGC CTCCATTATT ATGGTCACAT 240 

GGGCTAATTA GAATTATCAG AGCAAAGGCT ACCACAGACT GCAACAGCCT GAATGGAGTC 300 

CTGCAGTGTA CCTGTGAAGA CAGCTACACC TGGTTTCCTC CCTCATGCCT TGATCCCCAG 360 

AACTGCTACC TTCACACGGC TGGAGCACTC CCAAGCTGTG AATGTCATCT CAACAACCTC 420 

AGCCAGAGTG TCAATTTCTG TGAGAGAACA AAGATTTGGG GCACTTTCAA AATTAATGAA 480 

AGGTTTACAA ATGACCTTTT GAATTCATCT TCTGCTATAT ACTCCAAATA TGCAAATGGA 540 

ATTGAAATTC AACTTAAAAA AGCATATGAA AGAATTCAAG GTTTTGAGTC GGTTCAGGTC 600

ACCCAATTTC GAAATGGAAG CATCGTTGCT GGGTATGAAG TTGTTGGCTC CAGCAGTGCA 660 

TCTGAACTGC TGTCAGCCAT TGAACATGTT GCCGAGAAGG CTAAGACAGC CCTTCACAAG 720

CTGTTTCCAT TAGAAGACGG CTCTTTCAGA GTGTTCGGAA AAGCCCAGTG TAATGACATT 780 

GTCTTTGGAT TTGGGTCCAA GGATGATGAA TATACCCTGC CCTGCAGCAG TGGCTACAGG 840 

GGAAACATCA CAGCCAAGTG TGAGTCCTCT GGGTGGCAGG TCATCAGGGA GACTTGTGTG 900 

CTCTCTCTGC TTGAAGAACT GAACAAGAAT TTCAGTATGA TTGTAGGCAA TGCCACTGAG 960 

GCAGCTGTGT CATCCTTCGT GCAAAATCTT TCTGTCATCA TTCGGCAAAA CCCATCAACC 1020
ACAGTGGGGA ATCTGGCTTC
GCCAGCCATT TCAGGGTGTC 
ATCCTTAATT CAGCCTCAGT 
AGCTCACGGT TACTAGAGAC 
CCTCTGAATT TTTCTCGGAA 
CTCAAAAGGG GTTACAGCTA 
AGAGGCCGTG TGTTAATTGG 
AGCATGGCCT CGTTGACTCT 
GTCAATGGAC CTGTGATATC 
TTTTTTTCCA AGATAGAGTC 
CATTTGCAGT GGAACGATGC 
TGCCAATGTA CTCACTTGAC 
ATCTTCCCCG TTGTAAAATG 
ATTTTATGCC TGATCATCGA 
CACACACGTC GTATTTGCAT 
TTTATTGTTG GTGCCACAGT 
GTGTTCTTTA CACACTTCTT 
CTGCTGGCTT ACCGGATCAT 
GTTGGATTTT GCCTGGGTTA 
ACGCAACCTA GCAATACCTA 
AGCAAACCAC TCCTGGCTTT 
GTGGTGCTGC TAGTTCTCAC 
GATGACAAGG CCACCATCAT 
GGGCTCACCT GGGGCTTTGG 
GTTATTTTTG CTTTACTCAA 
TTGGACAGTA AGCTGCGACA 
CAAACAGAAA AGCAAAACTC 
AACCCACTGC AAAACAAAGG 
ATCATGCTAA CTCAGTTTGT 

Seq ID NO: 463 Protein sequence



CAATTCAACA 
AACCAACTGG 
ATTAGAAAAC 



TCAGATTAAA 
GTCAGACCAA 
GGGGAACATT 
CACGGTTATT 
AAACCTGAGC 



ATTCTGAGCA 
ATGGAGGATG 
ACAGTCTTAC 
ATCAGCACTC 
TGGAAAGGGA 
ATGTGTCCCC 
TTCCAGAGAT 
CTACCCGTTT 
CAAAACTATT 
CAGCCTCATT 
CTAGTGAATG 



TTCCAGTGAA 
AAAATACATC 
CCCTTCCAGA 
CCAAAAATGG 



GATCACCTAT G 

GGTGAACATA G 
GGACACCACG G 
CTACCTCTCT T 

TGGGTGCCCT 



G GTATCTCCAT 
A TTAAAAAAAG 
C TCTTGATTGC 



AAAGCTCTGG 



CTCATTATAT 
GATGTGTGTT 
GCACTGGCTA 



AAGAGCCTCC 



GGATGCTCAT 
CCCAGCATTT 
CTGTCATTAC 

TTGTGGCTGT 



TCTGTCACTC- 
AGCTGACAAT 
AAAGTATGCC 
GACAGCTCTT 
CAAAAGCCAA 
TATTCCCATC 
AACTATTATC 
AAATGCTCAG 
AGTTTTCCTA 
GGATTTCAGT 
CATCGTGACG 
CCCCTCTACA 
TGGAAGTCTC 
CCAAACCTCT 
TGATGTCTGG 
CACAGCTGCT 
GCTTGGCATC 
GATGATGGCT 
CATTGCTGTC 
GTCCAATGGA
GGATTTTTTA
TCTGCCAAAC 



ACTGAGTCGG 
CCCTCTGCTA 
GGCTTGGCAT 



CTCAAATGAA TAA 



CTGCCTTAAG T 
CCAAATTCTC AAAGCCTTTC 
CTGGAGATTC CTCCGACAAC 



MKVGVLWLIS 
EKRDLRNFLK 
NCYLHTAGAL 
IEIQLKKAYE 



LLKPPLLWSH 



ASHFRVSNST 
PLNFSRKFID 
SMA3LTLGNI 
HLQWNDAGCH 
ILCLIIEALF

TQPSNTYKRK 



RIQGFESVQV 
VFGKAQCNDI
FSMIVGNATE 
MEDVISIADN 
WKGIPVNKSQ 
LPVSKNGNAQ 



I 

~ FLGKNDGIKT 
GLIRIIRAKA 
SQSVNFCERT 
TQFRNGSIVA 
VFGFGSKDDE 
AAVSSFVQNL 
ILNSASVTNW 
LKRGYSYQIK 



KKELIVNKKK 



KIWGTFKINE 
GYEVVGSSSA
YTLPCSSGYR
SVIIRQNPST
TVLLREEKYA 
MCPQNTSIPI 



HLGPVEEYQL 
LQCTCEDSYT 
RFTNDLLNSS



RGRVLIGSDQ 



LLQVTYRDSK 
WFPPSCLDPQ 
SAIYSKYANG 
AEKAKTALHK 
GWQVIRETCV 
ILSNISSLSL 
ISTLVPPTAL 
FQRSLPETII 



LVNETQDIVT
WKQIKKSQTS 
LFFWMLMLGI 

DVCWLNWSNG SKPLLAFWP A 
3 KSLLILTPLL GLTWGFGIGT I 



IFPWKWITY VGLGISIGSL 
FIVGATVDTT VNPSGVCTAA 
VGFCLGYGCP LIISVITIAV 
WLLVLTKLW RPTVGERLSR 
VIFALLNAFQ GFFILCFGIL 
NPLQNKGHYA FSHTGDSSDN 



Seq ID NO: 464 DNA sequence 

Nucleic Acid Accession #: AB035089.1 

Coding sequence: 984S. .10219 



GGGCATGCAG 
CAGTTCTAGT 
CCAAGAGGAA 



AGGAGGAAGG 



AAGAAAGGAA 
GTAGACAGAA 
AATCTCCTCC 
AATAAAATGT 
AAGGAAGTGA 
GTAATGACAG 
GGATTCCTGG 
CAAGTGTTCA 
TCAGGATATG 



I 

~ CCATCGGGGA 
AAAAGGGAGA 
TTAGGGAGAG 
AGCATACAGT 
GAGGCAAGAG 
CCCTCTCTGC 
AGCTAGTTAG 
TCCTTGGGAA 
ACTAACCAGT 
TCTCTTGACT 



I 



I 



ACATCAATAT 



GCAAGGAGGA A 
TTAGCAATAG A 
TCAGCAAG GGGACAGGGT TAGATTTGGT 
AAATATGATG TCTGTCCCTG GCAGTGTTGG CAGAGTAGGA 
ATAATATCAT TTTCTCTGTG CTC CAACTGT ACTTACATAT 
TTTTCAAACC TTACTGGAGT TGTTTTCCCT CATGAAAACC 
TCTTGTTCTG AGGTTGTTCA ATGTATACAT ATCTATATCT 



CAAGCTTTCT 
TTCAACCTTC 
CCATAGATTG 
CCTCCATTCC 
TGGTACCCGA 
CTGGATGCAG 



GATATTTCCT 
AGCCAATGAA 
TATGCAAAAA 
TCCAGTCTCA 
ATTTAGGAAC 



TTCCCTATAG 
AATCTAGGAG 



AGGGCAAACC 
GTCCCCTGTA 
CAGGATGAGC 
GTAAATCCAT 
ACTCAGCTGA 



AGAAAAAAAT 
GCAAACTCCT 
TCCGTGCCTC 
ACCCCGGTTT 
TTGTTGCTTC 
CCTACTCCAA 
GAAGACCATT 



ATTGCCACAA 
ACAATGCTGA 
ATATTTCTTA 
TTTCCCATTG 
GTTTATGAAA 
TTTCTGAGTT 
TATGTCCTTT 
GCCTGAAATG 



TTTAATTTCT 



AGACGTTTAG 
GTCTCAGCTT 
TGTCCTATGA 
TAGAGGAAGG 



CTATTCCCCG 
ATATATATTG 
GTGAGAAATT 
CCTATGTGTT 



AGGATTTGTT 
TATCAAGAGA 
CTCTGTGGCA ATATATGACA 
CTAGCCTGTC TAT CACATGC 
ATTTCTCATT TGAACTCATC 
AGTTAGTACC TTTCCTTAAG 
CCATAGTCTG AAATTCTCTT 
GTTATCCTGT TTTTTTCTTC 
GACATTAGAT TCCTTTTCTT 
TCCATTTTTG ' 
GGAATTCTTT 
TCCATCAACG 
CTCATTCAGA ( 



TGTGGAGAAA 



TCTGGCACCT 
CACGGCTGGT 
TCAGAACTCT 



GGATTTTTAG T 
GGGAACTTAT ACCTCTTAA 
CCTAAGAAGA GCATGTCTTG G 
AAACACAGCT TCCTTTCTCC CATCCAGCCC 
TGTTGTAGAT AAATCTCCCT TGACTTTGTG 
GTTAAAAAGG GCCCATGACA ATACCAAGTG 
ATTCACGGTC GGTTGGAATG CACACTTGTG CAGAATTCTA 1740 



ATCTCAGATA 
CCCCATTAGT 
GACTCAAAAC 
AAACAGGCAG 
CTACTTTCAG 
ATGTGCTGAG
TGGAGAAGAG TCTGGCATTT CCTCAAAATG TTAACCTGGA TTTACCATAT GACCCAGCGA 1800

TTTCATTCAT AGGTTTATAC TCAAAAGAAA TGAAGAAATA TGCCATGCAA AAAAATGTAC 1860

ATGAAAGGTC ACAACATCAT TATTCATAAT AGTAAAAGGA TGGAAACAAC ACAAATGTCC 1920

ATCAACTTAT GATTAAAGAA AATCTGGTCT ATTCATAGAA TGGAATATTA TTCGACCACA 1980 

AAAAGGAATG ATGTACTGAT CCATGCAATG ATGTGGACAA ACCATGAAAA TAACACTAGA 2040 

TTAAAGAAGC CAGTCACAAA AGGACTTACT GTATGATTCC ATTTACCTGA AATGTTTGGA 2100 

ATAGGCAAAT CCATAGAAAC AGGAGGTAGA TTCCTGGTTT CCAGGGTCTC CAGGAAGGGA 2160 

AGAATGAAGT ACAAGATTTC TTTTGGAGGT AGTGAAATTG TTGTGGAATG AC-ATCATGAT 2220 

GATGAT AG CA CAACTTTGTG AATATAATAA AATCATTGAA TT3TACAGTT GAATTTATGG 2280 

TATATAAATT ATATGTTAAT AAAAAGGGGG TCCACAAAAC AAACAGCCCC CCACTCTGGT 2340 

TGTCAGGGAG ATATTGGATT AAATGGCCTT GGACAACAAC CCCTCTCCCT GGCCACAGAC 2400 

ATICTTCAGA TTACAAGATA TTCCAGGGGA AACACTGGAA TGAGTCTGAA GCCAGGTGCT 2460 

AAACAGAAGG ACCATTGAGA AATGTTGTGA TCCTGACAGG TCAAGCAATT TATTTTTCGG 2520 

CTTCATTTTT AAATGTAAAA TTAGAAAGCT GCCATTTAAA ATGGCCCGTC TGTTTCAATT 2580

GCTCTTCTCA GTGTCAGCCT GTTAACTCAA TGTGTTAGTC TGTTTTCATG CTGCTGATAA 2640

AAACATACCT GAGACTGGCA AGAAAAAGAG GTTTAATTGG GCTTAGAGTT CCACGTGATT 2700

GGGGAGGCCT CAGAATCACA GTAGGAGGCA AAAGTTATTC TTACATGGTG GCTGCAAGAG 2760

AAGATGAGGA AGAAGCAAAA GAAGAAACCC CTGATAAACC CATCGGATCT CCTGAGGCTT 2820

ATTAACTATC ATGAGAATAG CACAAGAAAG ACCGGCCCCC ATGATTCAAT TACCTCTACC 2880

TGGGTCCCTC CAATAACATG TGGAAATTCT GGTAGATACA ATTCAAGTTG AGATTTGGGT 2940

GGGAACACAG CCAAACCATA TCACTCAGCA AGGCAGATAA CTTTCTCACT GAGCCTATGC 3000 

AACAGAAAAC CATCTGGGAT GGTTGTAAGG GGCACAGGAA GTGACTGGTA GGATCACTGC 3060 

CAAAGCTGAG CACTCAGGAG AAGGCAATAG AATCCTATTC TCCATAGTAT GCTATAAGAT 3120 

ACTGAAGTAC ACTTCTTCAC TATCTCTTTG GACTTAGAAT TAGCACTACA TTCCTTGTTA 3180 

TACAGAAAAA TTACTAAGGA AATTCATAGG ATGACAAAAA CTTTCAGAAC TGAAAAACAG 3240 

GAAATGTAAG CTTTTTAGTT CTTTGGTATT CGAAGTATGC CTAAAAGACA ATGCAAAATC 3300 

CAAGAAAAGA ATGGTGGGGT TTTTGTTTGT TTGGTTTTGT TTTTGTTTTA CAGCTGGAGT 3360 

AGAATACAAA GGGATGGAGT TGAAACAAAT GAGAGGAAAT TGGAATTCTA AACTTATTCT 3420 

CATTGGCATT AGAAAGGCAC CTACATGTAT TTCACATGAG CCGGTGACTG CTGACTTGCA 3480 

TTCTTATTTT TTCCCTATAG ATTAAAAAGG AGGTACAATG GTAGAACTGT AATCCTGTCC 3540 

TTTGTCATAA ATTTTCATAT TCATAAAGGT GAGTGTTAGC CCGCTTGTGA AATCTGAAGT 3600 

TGAGTAACTT CAAATACTAA CCACAGAGGG AAAGGCAGCA AGAGGAGAGG CATAAATTTA 3660 

GGATCTCACC CTTCATTCCA CAGACACACA CAGCCTCTCT GCCCACCTCT GCTTCCTCTA 3720

GGAACACAGG TAAGAGCTTC AAGCCTCTCC AGCTTAATAA CATGAATTAT TTTTGAGAAT 3780 

AATAATGATA CTGTGTTCTA TATCATGCAT CTCCTGCATT CTGTCTGATT ATATTTTACT 3840 

TATTCTGCCA GAGCAAAATT AAAATACCTA TTTCATCTGA TTTGTCCTTT ATCTAAATTG 3900 

CTTAGTTCCA AGTAAACCAA GGCACTTTTA GGAACACAGA GGGAGAGTGC CTTGCAGCCA 3960 

GAGAGTCTTG AAGGAGATGT CAGGGACGCA TCTTAACAGC TGGTTGGATG TGATCCACAG 4020

AGGTCTCCTG TTAGCATTCA TTGTAAAGCC ATCCTACCTA GCTCTAGTGT AACCAGCAAT 4080 

GAAAGAAAGA TAAAGAGGGT CGATTACTTA TTTACAATAG TCTTTAAAAA CGTAGTTTTG 4140 

TAAGCCTTCT AATTAGGACA TTAATATATT TAATATATGC ACATTGTAGA AAGATTGAAG 4200

CGTTAAAAAT AAGAGAAAAA CTTTAAATGT CAAAATCTCA CBACCCAGAT ATATCATTTC 4260 



TTTAAGAAAA TTGTACTACA A 

TGATGCTTTT CCAGGAGTTC CAGATCACAT CGAGTTCACC ATGAATTCAC TCAGTGAAGC 4380 

CAACACCAAG TTCATGTTCG ATCTGTTCCA ACAGTTCAGA AAATCAAAAG AGAACAACAT 4440

CTTCTATTCC CCTATCAGCA TCACATCAGC ATTAGGGATG GTCCTCTTAG GAGCCAAAGA 4500 

CAACACTGCA CAACAAATTA GCAAGGTAGC TATCAGCATC ATTACGTTGT CCTGTTGCAG 4560 

TTTTTCTCTG GTTCCGTCGG CTAGCACGCA GATGGTAATA GATGTGGTGG TCTGATGGGT 4620

AGCACAGGGG GCTGTGCAGG AATTCCCATA ACTGTGAGAC CACTGACTTA AACAGATCTT 4680 

TTGAGTAAAG TTTTCTTGTC CCGCTTCATG TCTCTTCCAG GTTCTTCACT TTGATCAAGT 4740 

CACAGAGAAC ACCACAGAAA AAGCTGCAAC ATATCATGTG AGTCACAGAG CACTCTGATT 4800 

CAGCTTTAGA TCCCTGAACA GGTCATAGTT TAAACCTGGA ACTTCACAAA AACTAAGAAA 4860 

AGGCCAGTTT TAGGGAAAAT CTTGGACACA AAGATTGAGA CATACAGAGT GGGTTGGCAT 4920

TTCATGGCAC ATAATTATTA TTCCTCATTT CTGCGTTACT AAAAGACAGT CAGCACTGTA 4980

CCTCAGAGCA TAGGTCTGGA TCAGGATAGG CTGGGTTCAG ACTCCAGCTT TGCTCTTCAC 5040 

AAATGATGAA TAAGAGCAGG ACACAACTGC TCGGAGTCCC AGTGACCTCA TCCCAGAAAA 5100 

CTAAGGGTAA GAAAAAATCT GACTCAATAC ATGCAAATAC ATGCAAATGT TTACAACAGT 5160

GCCTTGCCCA TAAAAGTCAT AATAAATGTT ATTATTATTA TAAAGTAGCT ATAATTATAC 5220

TAATCATAAT AATGTGAAAA TAATTTAATT TTCATTGAGT CATTAATGAG ATTCAGAGGA 5280

ATAAGCACAA GTCCAAGTAT ATTTTGGAAA ATGATTGCTA TGGAATATAT TGGTTTAGAG 5340 

CCTTAATAGT GCAAAATGCT TTGCTGGAAG GTAGAAAGTT CTAGATTTAA ACAGGCTTAG 5400 

GTTCAAAACT TGGCACTTCT AATTTATGTC TCTATAAACA GGGTTTTTTT CCCCATTCTC 5460 

TGAGCTTTCT TGTGTTCATC TGAATTGAAC TAAAGACTTA GAGTTACCCA TGTAAAGTCC 5520 

TTAGCCATGG ACCTGGCATA CACTCTTCTT ACGTGCAGAG AATGACCATC ATGAGGAAAG 5580 

AGCCACAGAT CAGTCAATGT GTCCTACAAG ATAATAGCAC CAACAGGTAT AACAGGGCTT 5640 

CCTGGCATAA TCTATTTAAA ATATCCAACC TTCAACATAC TCGTATCCTT GATGACTGTT 5700 

AGAAGTGAAA TATGGTCCTT GCCCATAAGG AGCTGAGAGT TTAACTGGGA AGCTAAACCT 5760 

AACCCTTTAA ACCAACAAGG AGAAAATCTA CTGGTAGACA GCGCTGCATC TTTAGTTCAG 5820 

AAGAGAAAAG ATTGCAGTAC GTTAGAGCAA GAAGAATTTT CTGGAAGAAG TCAAATATAA 5880 

GGTGGATTTT GAAGGGTATT TGAGGTGAAA TACACCAATT ATCAGGGAAT AACATCAAAG 5940 

GTCCTCAATG AGACTACCAG CATTTAGGGA CTGATCTAAC AGACTTAGCA TGGGTTTAGT 6000 

ATTTACATTG ATACAGCAAT TGAATGATCT CCTTTTTTGA TGTTTGAAGG TTGATAGGTC 6060 

AGGAAATGTT CATCACCAGT TTCAAAAGCT TCTGACTGAA TTCAACAAAT CCACTGATGC 6120

ATATGAGCTG AAGATCGCCA ACAAGCTCTT CGGAGAAAAG ACGTATCAAT TTTTACAGGT 6180 

AATTTCACCT GGCCTACCCA CATTTCATTT GCATCCTGAT GTCTGTGTCT CTGAGTGGCC 6240 

AAATGGAAGA AAGCAAGGCA GATGAGCCTG GCCGACCCAG GTGGAGAGCA TTTACTCAGA 6300 

GTGCATTAGC TCCATTTCCA CAACTCTCCC CCACTGGAGT GTCCCAGACC CCAACGATAC 6360 

ATCACTGAAG TGTGGATTTA GGGATAATCT TGTGATAAAA GAGGAGGTTG TGTAATAGAG 6420

TGAGTAAGAG TAATAAGTAA TAAGATACCA TCGATAAACT GGCACTGACT CAGTCACATA 6480 

CGATACATCT TGGTGGGAAA TGTATGACTA ATGGGATATT ATTGGAATGG GCAGGCTTGG 6540 

GTGAGTTCCT GAGAATAGTT GAGGAAGTAC CAGGAAATAT TGAATGCACA GGATGAAAGA 6600 

CAAAAACAAA GATCAGAAAC ATCATGGTTA AAATTACTGG AGAGAAGTCT GAGAAGCAAT 6660 

GAATCTCCTT CAGGGAAGCC TGCTCTGCAG TTTGCAAACC ACAC =TCTT '"".TCTGC 6720 

CTTTTGCCAA GATGATATTG ACCTTCAGTG ACCTCTTTCT TGTG CAGCC i Cf TCCCC 6780 

TTTTGCATTG CCTACATGAC ACCTGTATAA AAATATCCAT GGACAGGAGA TACTGCATCT 6840

ATTCAGGGTC TGGATTCAGC TTACTGTTGT TACAAATAAG TAAGTTTGGT AATATATAGT 6900 

TACATAAATT ACTCCTAATT CCTACTTCTT CCTTCATATC TCAAAGGAAT ATTTAGATGC 6960
CATCAAGAAA TTTTACCAGA CCAGTGTGGA ATCTACTGAT TTTGCAAATG CTCCAGAAGA 7020

AAGTCGAAAG AAGATTAACT CCTGGGTGGA AAGTCAAACG AATGGTAGGA GAGCCACCCA 7080

TTATAGAAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGG 7140 

GAACAGGTGT GGGGATTGAG ATGGGTTTGC AGGGAGGGCT C-AAGAC-GGCA CTCCAGATGA 7200

AGGATTTGTC CAAATGAATA TGAAGAGAGC CTAGGGGAGC CAAGGAGGAA ATCACAGGAA 7260

GCCAATTAGA TGGAAACACA TCTGGAGAAT TATTTGCTTA TGGCCCTGCA TGACAATAGC 7320 

TTTGTGGATC CCCTGTCTCC GCTCAGACCT ATTTTGAGAT CATATCCTTT ACTTTAAATC 7380 

AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAACATT AGAAAGCGTC TCTCGTCTCC 7440

TTTACTAATT GGGAAACAAG CAGCTCTCTG GTAAATCACC CTTTTGTCTC TGAGCTGGAG 7500

CTGCCTGGAT CACATCTGTA GCCAATGTGT TCTGCAGGGA TTATCACAGC TCTCTTCCCC 7560

ATCAAGGGCA AAGAGCTTGA CAAAGTCTCC ATTCTACAGA CATCTTTCTT ACCTCCCACC 7620

TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGAG TGAATAGGAA GATACCCCCG 7680 

GAAGTAGTGT CTGACAGCAC AGGACATGCG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 

CTAAAATGCA ATCAGGGCCT CCTTCCTCTG AATGGGGACC CCGTAGTTAA AAAAAAATAA 7800 

AAGTAGGAAG AGGAGGGAGG GAGAAAGGAA AGACACATGT TGGAAGAGTA GACAAAATCA 7860 

GTTTATCAGT ATTCCAAATC AGATGATTGG AGACATTCAT ACACAGAGAA CGTGAACTCC 7920 

TTCTCTATCA CAAGAAGTGA TGTCTCCATC AAGGGTAACT TTATACGACT GGAGCCTTGA 7980 

AGAAAGCTGC ATCTGGTGAA CCACTGGTCA GTGAGTCTAA CAATTCAAAG ATCAAAGTCA 8040 

GTGAGTCTCA AGCAGGGATT TGGGTCAATA ATTAACGATC AGTCACGAAC ATTTGCAAAG 8100 

CATCTTCCAG ACAAGCCATT TGTAGCTTGT GTAAAAGACT CTTTTATTCT TTCCCTTGCA 8160 

GAAAAAATTA AAAACCTATT TCCTGATGGG ACTATTGGCA ATGATACGAC ACTGGTTCTT 8220

GTGAACGCAA TCTATTTCAA AGGGCAGTGG GAGAATAAAT TTAAAAAAGA AAACACTAAA 8280

GAGGAAAAAT TTTGGCCAAA CAAGGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 

TAATACATGG AATGTTAAAC ATTTCTGATG GAATGTAACA TGATAAGTAA AAAATAAAAA 8400 

TTGTTCATGT CTGTTATTTT GTTGTTTTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 

GAAAAACTAT TGTTTCTAAC TCATGGAATT CCTGGGTTAT TTCTTAGAAG AAGAAGGATG 8520 

TGTTGCTATC TCAATAATAT TATCTTTTTT GTCTTGTGTT TCACGTGTTA TTTGTTGGAC 8580 

ACATTGATTT ATTGCAGAAT ACATACAAAT CTGTACAGAT GATGAGGCAA TACAATTCCT 8640 

TTAATTTTGC CTTGCTGGAG GATGTACAGG CCAAGGTCCT GGAAATACCA TACAAAGGCA 8700 

AAGATCTAAG CATGATTGTG CTGCTGCCAA ATGAAATCGA TGGTCTGCAG AAGGTAAGAA 8760 

CTTGCATCTA CAACTCTTCC TTCTACTGCC GGACATTTTT CCAAAGATAC CAAGTTTAAA 8820 

CAAGGTAAAA GCTTATGACC GAGTTGCCTC AAAATGATGA AAAATTCTAA ATGAGGAATG 8880 

ATGACTCACC TTCATATTAC AAATATTTGA GCATAGGGCC TGACACAAAC TGAAAGCTTA 8940 

GTTTTTGTTT GTTTGTTTGT TTTTATTATT ATTATTATAA TACTTTAAGC TTTAGGGTAC 9000 

ATGTGCACAA TGTGCAGGTT AGTTACATAT GTATACATGT GCCATGCTGG TGTGCTGCAC 9060 

CCATTAACTC ATCATTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTCC CCCCTCCCCC 9120 

CACCCCACAA CAGTCCTCAG AGTGTGATGT TACCTTCCTG TGTCCAAGTG TTCTCATTGT 9180 

TCAATTCCCA TCTATGATTT AATTCCATCT ATGGCTTAGT TAATGATTAA TTTATTAGAG 9240 

TTACATGCAT TGGATATCAA TTTGATGATA TTATTATGCA GCAATTTAAA CTTGACTGGG 9300 

AGAAATATAT ACCAATGTGA GGAAAGTTTA CAAATAGGCC GAGTAGAAAA GGGAATACAA 9360 

ATTTAGGAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATGTATACA 9420 

GAAAAATATG ATGAGCCTAT TAAAAATTGA CACATGTAGT AGGCTGTTGG CACAAGAAAT 9480

AGTGATACAT ACAGTTCATT GTGTACAAAA TAATGTAATC ATATTTTACA TGTGTATCAT 9540 

ACAGTTGTAT ACATACATAT GTACACATAT ACATATACGT AAAAACATGA TTCTGTTTTT 9600 

ACATACATGT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGGACTCATA 9660 

TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAACCT TCTGTATTTC ACATTTATTG 9720 

CCAAAATAAC GAATCTCCAC ATAGTCAATT CATTGTTAAG GTGTATTAGA GATCGACAGT 9780

TAGTCATATC AGTTTCTTTT TTCCATTTGT ATAGCTTGAA GAGAAACTCA CTGCTGAGAA 9840

ATTGATGGAA TGGACAAGTT TGCAGAATAT GAGAGAGACA TGTGTCGATT TACACTTACC 9900

TCGGTTCAAA ATGGAAGAGA GCTATGACCT CAAGGACACG TTGAGAACCA TGGGAATGGT 9960

GAATATCTTC AATGGGGATG CAGACCTCTC AGGCATGACC TGGAGCCACG GTCTCTCAGT 10020 

C CTACACAAGG CCTTTGTGGA GGT CACTGAG GAGGGAGTGG AAGCTGCAGC 10080 

r GTAGTAGTAG TCGAATTATC ATCTCCTTCA ACTAATGAAG AGTTCTGTTG 10140 

T TTCCTATTCT TCATAAGGCA AAATAAGACC AACAGCATCC TCTTCTATGG 10200 

CAGATTCTCA TCCCCATAGA TGCAATTAGT CTGTCACTCC ATTTAGAAAA TGTTCACCTA 10260 

GAGGTGTTCT GGTAAACTGA TTGCTGGCAA CAACAGATTC TCTTGGCTCA TATTTCTTTT 10320 

CTATCTCATC TTGATGATGA TAGTCATCAT CAAGAATTTA ATGATTAAAA TAGCATGCCT 10380

TTCTCTCTTT CTCTTAATAA GCCCACATAT AAATGTACTT TTCCTTCCAG AAAAATTTCC 10440 

CTTGAGGAAA AATGTCCAAG ATAAGATGAA TCATTTAATA CCGTGTCTTC TAAATTTGAA 10500 

ATATAATTCT GTTTCTGACC TGTTTTAAAT GAACCAAACC AAATCATACT TTCTCTTCAA 10560 

ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACCTAAA TCCTTCTTAT 10620

GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCATCAATA AAATAATGAC ATAAAATCAT 10680

TTTTGCTTTA CCTGTTTTCT CTCTGGAAAG GGCAAGTGTC CAGTTACACA TAGGAAAGAT 10740 

AATTTAGAGA TATATTAATC ATATATAAAG GAAAATTAAA AACAGAGTAG TTCATGATGA 10800 

GCCTGGAGTA GAAGGCATAT CCCAGAACAG GAGGAGCCTT GTAAACCACA TAGGAACTTC 10860

CTATTTTATG CTAAAGGGAT AAGAAACTCA TTACAGGCTT TGATGGTTGT TTGTCAAAGA 10920 

GGGGCATAAA ATTATCATAT CCACATCTAG AAAATACATC TCTGGCTACG CTGATATCAA 10980

TGGATGCGAG GAAAGAACAG TGTGGTTACC ATATATAAAT TAGGAAATCA TTAGAGTATT 11040 

GGGAGTGGAA ATGGAGAGAA AGAAAGAGCC TGGGGGAATT ATTTAGGAAA TAATAGTTAC 11100 

AGAAAGACAT CTAAGTTGCT GACCTATCTG ACTGGATGGA TGGAAGAATA TCTTGTTTCT 11160 

GAGAGAAAAA AAGACTTTGG GTTTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 

TCAAATGGAT TTGCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAG AAACAGAATA 11280

TGATCTGAAG CTCTAAATTT GTGATATTCA ATATAAATAC TTTAGAGTCA TTGGGATAAA 11340 

TATGGTAGTT GTAGCTAAAA GCAAAAATAA GATACTAGGG AGAAAGGATA AAGTTAGAAG 11400 

AAAGAAGAAT CTAGAATTGA CCTTGAAGTA TATCAGCATG TGTAAAGATC AGGAATTGAT 11460 

CATTTTTATT TTCCAGAAAG TAGCTTTTCT TAGGGTTCCA TATTTACTCC CATAGATTCT 11520 
TCCC 



MNSLSEANTK FMFDLFQQFR KSKENNIFYS PISITSALGM VLLGAKDNTA QQISKVLHFD
QVTENTTEKA ATYHVDRSGN VHHQFQKLLT EFNKSTDAYE LKIANKLFGE KTYQFLQEYL 
DAIKKFYQTS VESTDFANAP EESRKKINSW VESQTNEKIK NLFPDGTIGN D 
YFKGQWENKF KKENTKEEKF WPNKNTYKSV QMMRQYHSFN FALLEDVQAK V
LSMIVLLPNE IDGLQKLEEK LTAEKLMEWT SLQNMRETCV DLHLPRFKME ESYDLKDTLR 300

TMGMVNIFNG DADLSGMTWS HGLSVSKVLH KAFVEVTEEG VEAAAATAVV VVELSSPSTN 360
EEFCCNHPFL FFIRQNKTNS ILFYGRFSSP 

Seq ID NO: 466 DNA sequence 
Coding sequence: 50.. 1240 

i r r i 1 r r 

GGAGAGAAGA AAGGAGGGGG CAAGGGAGAA GCTGCTGGTC GGACTCACAA TGAAAACGCT 60 

CCTTCTTTTG CTGCTGGTGC TCCTGGAGCT GGGAGAGGCC CAAGGATCCC TTCACAGGGT 120

GCCCCTCAGG AGGCATCCGT CCCTCAAGAA GAAGCTGCGG GCACGGAGCC AGCTCTCTGA 180

GTTCTGGAAA TCCCATAATT TGGACATGAT CCAGTTCACC GAGTCCTGCT CAATGGACCA 240 

GAGTGCCAAG GAACCCCTCA TCAACTACTT GGATATGGAA TACTTCGGCA CTATCTCCAT 300 

TGGCTCCCCA CCACAGAACT TCACTGTCAT CTTCGACACT GGCTCCTCCA ACCTCTGGGT 360 

CCCCTCTGTG TACTGCACTA GCCCAGCCTG CAAGACGCAC AGCAGGTTCC AGCCTTCCCA 420 

GTCCAGCACA TACAGCCAGC CAGGTCAATC TTTCTCCATT CAGTATGGAA CCGGGAGCTT 480 

GTCCGGGATC ATTGGAGCCG ACCAAGTCTC TGTGGAAGGA CTAACCGTGG TTGGCCAGCA 540 

GTTTGGAGAA AGTGTCACAG AGCCAGGCCA GACCTTTGTG GATGCAGAGT TTGATGGAAT 600 

TCTGGGCCTG GGATACCCCT CCTTGGCTGT GGGAGGAGTG ACTCCAGTAT TTGACAACAT 660 

GATGGCTCAG AACCTGGTGG ACTTGCCGAT GTTTTCTGTC TACATGAGCA GTAACCCAGA 720 

AGGTGGTGCG GGGAGCGAGC TGATTTTTGG AGGCTACGAC CACTCCCATT TCTCTGGGAG 780 

CCTGAATTGG GTCCCAGTCA CCAAGCAAGC TTACTGGCAG ATTGCACTGG ATAACATCCA 840 

GGTGGGAGGC ACTGTTATGT TCTGCTCCGA GGGCTGCCAG GCCATTGTGG ACACAGGGAC 900

TTCCCTCATC ACTGGCCCTT CCGACAAGAT TAAGCAGCTG CAAAACGCCA TTGGGGCAGC 960 

CCCCGTGGAT GGAGAATATG CTGTGGAGTG TGCCAACCTT AACGTCATGC CGGATGTCAC 1020 

CTTCACCATT AACGGAGTCC CCTATACCCT CAGCCCAACT GCCTACACCC TACTGGACTT 1080

CGTGGATGGA ATGCAGTTCT GCAGCAGTGG CTTTCAAGGA CTTGACATCC ACCCTCCAGC 1140 

TGGGCCCCTC TGGATCCTGG GGGATGTCTT CATTCGACAG TTTTACTCAG TCTTTGACCG 1200

TGGGAATAAC CGTGTGGGAC TGGCCCCAGC AGTCCCCTAA GGAGGGGCCT TGTGTCTGTG 1260

CCTGCCTGTC TGACAGACCT TGAATATGTT AGGCTGGGGC ATTCTTTACA CCTACAAAAA 1320

GTTATTTTCC AGAGAATGTA GCTGTTTCCA GGGTTGCAAC TTGAATTAAG ACCAAACAGA 1380

ACATGAGAAT ACACACACAC ACACACATAT ACACACACAC ACACTTCACA CATACACACC 1440 

ACTCCCACCA CCGTCATGAT GGAGGAATTA CGTTATACAT TCATATTTTG TATTGATTTT 1500 

TGATTATGAA AATCAAAAAT TTTCACATTT GATTATGAAA ATCTCCAAAC ATATGCACAA 1560 

GCAGAGATCA TGGTATAATA AATCCCTTTG CAACTCCACT CAGCCCTGAC AACCCATCCA 1620 

CACACGGCCA GGCCTGTTTA TCTACACTGC TGCCCACTCC TCTCTCCAGC TCCACATGCT 1680 

GTACCTGGAT CATTCTGAAG CAAATTCCGA GCATTACATC ATTTTGTCCA TAAATATTTC 1740 

TAACATCCTT AAATATACAA TCGGAATTCA AGCATCTCCC ATTGTCCCAC AAATGTTTGG 1800 

CTGTTTTTGT AGTTGGATTG TTTGTATTAG GATTCAAGCA AGGCCCATAT ATTGCATTTA 1860 

TTTGAAATGT CTGTAAGTCT CTTTCCATCT ACAGAGTTTA GCACATTTGA ACGTTGCTGG 1920 

TTGAAATCCC GAGGTGTCAT TTGACATGGT TCTCTGAACT TATCTTTCCT ATAAAATGGT 1980 

AGTTAGATCT GGAGGTCTGA TTTTGTGGCA AAAATACTTC CTAGGTGGTG CTGGGTACTT 2040

T GCTGGTGCCT CTCTATTGGT AATGTTAAGA 2100 



I I I I I I 

MKTLLLLLLV LLELGEAQGS LHRVPLRRHP SLKKKLRARS QLSEFWKSHN LDMIQFTESC 60

SMDQSAKEPL INYLDMEYFG TISIGSPPQN FTVIFDTGSS NLWVPSVYCT SPACKTHSRF 120

QPSQSSTYSQ PGQSFSIQYG TGSLSGIIGA DQVSVEGLTV VGQQFGESVT EPGQTFVDAE 180

FDGILGLGYP SLAVGGVTPV FDNMMAQNLV DLPMFSVYMS SNPEGGAGSE LIFGGYDHSH 240

FSGSLNWVPV TKQAYWQIAL DNIQVGGTVM FCSEGCQAIV DTGTSLITGP SDKIKQLQNA 300

IGAAPVDGEY AVECANLNVM PDVTFTINGV PYTLSPTAYT LLDFVDGMQF CSSGFQGLDI 360 
HPPAGPLWIL GDVFIRQFYS VFDRGNNRVG LAPAVP 

Seq ID NO: 468 DNA sequence 

Nucleic Acid Accession #: NM_018058.1

Coding sequence: 319.. 1575 

i r r u i 1 t 

TACGCGCTGC GGGACCGGCA GGGGAACGCC ATCGGGGTCA CAGCCTGCGA CATCGACGGG 60 

GACGGCCGGG AGGAGATCTA CTTCCTCAAC ACCAATAATG CCTTCTCGGG GGTGGCCACG 120

TACACCGACA AGTTGTTCAA GTTCCGCAAT AACCGGTGGG AAGACATCCT GAGCGATGAG 180 

GTCAACGTGG CCCGTGGTGT GGCCAGCCTC TTTGCCGGAC GCTCTGTGGC CTGTGTGGAC 240 

AGAAAGGGCT CTGGACGCTA CTCTATCTAC ATTGCCAATT ACGCCTACGG TAATGTGGGC 300 

CCTGATGCCC TCATTGAAAT GGACCCTGAG GCCAGTGACC TCTCCCGGGG CATTCTGGCG 360 

CTCAGAGATG TGGCTGCTGA GGCTGGGGTC AGCAAATATA CAGGGGGCCG AGGCGTCAGC 420

GTGGGCCCCA TCCTCAGCAG CAGTGCCTCG GATATCTTCT GCGACAATGA GAATGGGCCT 480

AACTTCCTTT TCCACAACCG GGGCGATGGC ACCTTTGTGG ACGCTGCGGC CAGTGCTGGT 540 

GTGGACGACC CCCACCAGCA TGGGCGAGGT GTCGCCCTGG CTGACTTCAA CCGTGATGGC 600 

AAAGTGGACA TCGTCTATGG CAACTGGAAT GGCCCCCACC GCCTCTATCT GCAAATGAGC 660 

ACCCATGGGA AGGTCCGCTT CCGGGACATC GCCTCACCCA AGTTCTCCAT GCCCTCCCCT 720 

GTCCGCACGG TCATCACCGC CGACTTTGAC AATGACCAGG AGCTGGAGAT CTTCTTCAAC 780

AACATTGCCT ACCGCAGCTC CTCAGCCAAC CGCCTCTTCC GCGTCATCCG TAGAGAGCAC 840 

GGAGACCCCC TCATCGAGGA GCTCAATCCC GGCGACGCCT TGGAGCCTGA GGGCCGGGGC 900 

ACAGGGGGTG TGGTGACCGA CTTCGACGGA GACGGGATGC TGGACCTCAT CTTGTCCCAT 960

GGAGAGTCCA TGGCTCAGCC GCTGTCCGTC TTCCGGGGCA ATCAGGGCTT CAACAACAAC 1020

TGGCTGCGAG TGGTGCCACG CACCCGGGTT GGGGCCTTTG CCAGGGGAGC TAAGGTCGTG 1080 

CTCTACACCA AGAAGAGTGG GGCCCACCTG AGGATCATCG ACGGGGGCTC AGGCTACCTG 1140 

TGTGAGATGG AGCCCGTGGC ACACTTTGGC CTGGGGAAGG ATGAAGCCAG CAGTGTGGAG 1200 

GTGACGTGGC CAGATGGCAA GATGGTGAGC CGGAACGTGG CCAGCGGGGA GATGAACTCA 1260 



GTGCTGGAGA TCCTCTACCC
ACACCAATGA ATGCATCCAG
ACACCTATGG AAGCTACAGG 
ACGAGGATGG CACAGCCTGC 
CCCCCACCGC TGCTGCTGCC 
CACCGGTCCT CGTAGATGGA 
CCAGCTGCTG AGCAGGGGTG 
AAGTGGGCTT GTGCTGCTGC 
CCCAAGCCCA TCCATGCACA 
CTGTGCTGGG CACATAGCTG 
ATTCCAGTGG GTCTAATGAC 
CTGCACAGGA AGTATGAGGA 
AAAGCTATGT GACCTTACAC 
AAATGGGGAT TAAGAATAGA 



TTCCCATTCG 



GACACACTTC AGGACCCAGC CCCACTGGAG 
TGTGCCCTCG AGACAAGCCC GTATGTGTCA 
ACAAGAAGTG CAGTCGGGGC TACGAGCCCA 



GGACATGAAC 
CTAGACAGTA 
TTACTTAGCT 



CTGCTGCCGC 
TGGGGTCGGT
CAGCGGATGG 



CATATCTTAG 



3 GCACATAGTA 



AACAATTAGG 
AGACAGGGTC
GACACAGATG 
CTGAGTTCAA 
ACTTGTTAGC
TAGTGTGGAG 
AAGGCTCAAT 



GGCCTGGGAG
GAGACTCGTA 

GCTGCCCTGA 



ATCCTGATTC



GGTGGTGTCA 
AGGAACTCAC 
CGCATCTGCA 
ATGTATGTAA : 
GCCTCTCACT
I



I 



I 



I 



MDPEASDLSR GILALRDVAA EAGVSKYTGG RGVSVGPILS S 
RGDGTFVDAA ASAGVDDPHQ HGRGVALADF NRDGKVDIVY GNWNGPHRLY LQMSTHGKVR
FRDIASPKFS MPSPVRTVIT ADFDNDQELE IFFNNIAYRS SSANRLFRVI RREHGDPLIE
ELNPGDALEP EGRGTGGVVT DFDGDGMLDL ILSHGESMAQ PLSVFRGNQG FNNNWLRVVP
RTRVGAFARG AKVVLYTKKS GAHLRIIDGG SGYLCEMEPV AHFGLGKDEA SSVEVTWPDG
KMVSRNVASG EMNSVLEILY PRDEDTLQDP APLETPMNAS SSH3CALETS PYVSTPMEAT 
GAGPTRSAVG ATSPTRMAQP AWGLSASHRA PAPPPPPLLL PLPLLLPLLE LPLLHRSS 



Seq ID NO: 470 DNA sequence
Nucleic Acid Accession #:
Coding sequence:



I 



CAGCGGGCTG 
AGTAATCCCA 
TTTGAGATCG 
CAGAAGCGGC 
GACCGGCAGG 
GAGATCTACT 
TTGTTCAAGT 
CGTGGTGTGG 
GGACGCTACT 
ATTGAAATGG 
GCTGCTGAGG 
CTCAGCAGCA 
CACAACCGGG 
CACCAGCATG 



TGTTACCGTT 
AACCCATGTT 
CCCAGCTCAA 
TCGTGGCGGG 
TGGTGAACAT 



CCTGCTGCTG C 



CTATGGTGTG 
GTACAATGGA 
CGCGGTCGAT 
CGGGGTCACA 



CCCAACCTGG 
GAGCGCAGCT 
GCCTGCGACA 
TTCTCGGGGG 
GACATCCTGA 



TGCCCATCAC TGAGGGGTCC 
TTCTGCCTCC TGACTATGAC 
ATGTGGACCA TGATGGGGAC 



ACCCTGAGGC 



CAAATATACA 



ATCACCGCCG 



ATCGAGGAGC 
GTGACCGACT 
GCTCAGCCGC 



AAGAGTGGGG 



GCGATGGCAC 
GGCGAGGTGT 
ACTGGAATGG 
GGGACATCGC 
ACTTTGACAA 
CAGCCAACCG 
TCAATCCCGG 
TCGACGGAGA 
TGTCCGTCTT 
CCCGGTTTGG 
CCCACCTGAG 
ACTTTGGCCT 



CGCCCTGGCT 



GATGGCAAGA TGGTGAGCCG GAAC 
CTCTACCCCC 
TTCTCCCAGC 



CTCACCCAAG 
TGAC CAGGAG 
CCTCTTCCGC 
CGACGCCTTG 
CGGGATGCTG 
CCGGGGCAAT 
GGCCTTTGCC 
GAT CATCGAC 
GGGGAAGGAT 



IGTGGCCT 
GCCTACGGTA 
TCCCGGCGCA 
GGGGGCCGAG 
GACAATGAGA 
GCTGCGGCCA 
GACTTCAACC 
CTCTATCTGC 
TTCTCCATGC 



CACCCTACTA 
TCGACGGGGA 
TGGCCACGTA 
GCGATGAGGT 
GTGTGGACAG 



CAACGTGGCC 



CAGGGCTTCA 
AGGGGAGCTA 
GGGGGCTCAG 
GAAGCCAGCA 



GTGCTGGTGT 
GTGATGGCAA 
AAATGAGCAC 
CCTCCCCTGT 
TCTTCAACAA 
GAGAGCACGG 
GCCGGGGCAC 
TGTCCCATGG 
ACAACAACTG 



TGATGCCCTC 
CAGAGATGTG 
GGGCCCCATC 
CTTCCTTTTC 
GGACGACCCC 
AGTGGACATC 



CCGCACGGTC 
CATTGCCTAC 
AGACCCCCTC 



AGAGTCCATG 



GTGTGCCCTC G 



AACAAGAAGT 
CTCGGCCAGT 
GCTGCTGCCG 
CTGGGGTCGG 



AGGGATGTAA 



CAGACAGGGT 



CACCGGGCCC 
CTGCTGGAGC 
TGGTTAAGGA 
GAGTCCAGCA 
AGGCCTGGGA 
GGAGACTCGT 
CGCTGCCCTG 



CACACTTCAG 
CCATTGCATG 
CGTATGTGTC 
CTACGAGCCC 



A TGAACTCAGT GCTGGAGATC 



CACTGGAGTG 
AATGCATCCA 
GAAGCTACAG 



TGGCCAAGGA 
GTTCCCATTC 
GTGCCGGACC 



TGCCACTGCT G 



CCTGAGTTCA 
AACTTGTTAG 
TTAGTGTGGA 

AAAGGCTCAA 



AATCCTGATT 
CCATCCATTA 
GATTAGATTA 



GGGGAGTGGG 
GCTAGACCCT 
AAGGC CAGGC 
ATGGCGCTTA 
AGGTGGTGTC 



AAAGTGGGCT 



TCGCATCTGC 
AATGTATGTA 

TGCCTCTCAC 



CCTGTGCTGG 
CATTCCAGTG 
ACTGCACAGG 
CAAAGCTATG 



AGACACTTGG 



TGTGCTGCTG 
ATCCATGCAC 
GCACATAGCT 
GGTCTAATGA 
AAGTATGAGG 
TGACCTTACA 
TTAAGAATAG 
CACAAAACCT 
CAACACG 



CCTAGACAGT 
ATTACTTAGC 
GTGATCACAG 
CCATATCTTA 
ACTTTAGTGT 
CCAGTCACTT 
AATCTTGGGG 
GGCACATAGT 



2220 
2280 
2340 



MSRMLPFLLL LWFLPITEGS QRAEPMFTAV TNSVLPPDYD SNPTQLNYGV AVTDVDHDGD
FEIVVAGYNG PNLVLKYDRA QKRLVHIAVD ERSSPYYALR DRQGNAIGVT ACDIDGDGRE
EIYFLNTNNA FSGVATYTDK LFKFRNNRWE DILSDEVNVA RGVASLFAGR SVACVDRKGS







GRYSIYIANY AYGNVGPDAL 
LSSSASDIFC DNENGPNFLF 
VYGNWNGPHR LYLQMSTHGK 
RSSSANRLFR VIRREHGDPL 
AQPLSVFRGN QGFNNNWLRV 
PVAHFGLGKD EASSVEVTWP 
FSQQENGHCM DTNECIQFPF 
LGQSPGPRPT TPTAAAATAA 

Seq ID NO: 472 DNA sequence



SRGILALRDV AAEAGVSKYT GGRGVSVGPI 
AAASAGVDDP HQHGRGVALA DFNRDGKVDI 
FSMPSPVRTV ITADFDNDQE LEIFFNNIAY
EPEGRGTGGV VTDFDGDGML DLILSHGESM 
KSGAHLRIID GGSGYLCEME 
LYPRDEDTLQ DPAPLECGQG 
NTYGSYRCRT NKKCSRGYEP NEDGTACVGT
APVLVDGDLN LGSWKESCE PSC
AGCGGCTCCT
GTTCTGAAGT 
TCACCCTACT 
ATCGACGGGG 
CACAGCAGCT 



CGGGAGGACT 
CCCCAGCATC 
ATGACCGGGC 



CCCTCCCCAT 



TCCTCCCTGG 



CAGCGCAGGT 
CCCCTGCAGG 
GTCAGGCTTC 
GACTGAGACC 



GGACCGGCAG 
CCCTTCTGGG 



TGCTCTGGTT GGATGGGACT 
TCCTCCTCCA GGTACAATGG 
CTGGTGAACA TCGCGGTCGA 



: TCCGGACAGC 
I TACCCATGAA 
GGGGTGGCCA CGTACACCGA CAAGTTGTTC 
[* GGCCCGTGGT 
! CTCTGGACGC 
I CCTCATTGAA 



TTCCTCAACA CCAATAATGC 
CTCCACAGAA ACAGGCCTGT 
CTGCCTCCAC TCAGCGGAAG 



ACCCAACCTG 
TGAGCGCAGC 
AGCCTGCGAC 



GCTGAAGCCT 



CTGAGCGATG 
GCCTGTGTGG 
GGTAATGTGG 



AGGT CAACG T 
ACAGAAAGGG 
GCCCTGATGC 
CGCTCAGAGA 
TTCTCCCACA CTGCCTCTCC 
GGAGGAGACC CAGAGGAGGC 
TGCCGGCTGG GCTGGAAGGA 



CCAGAACCAT 
AAGTTCCGCA 
GTGGCCAGCC 
TACTCTATCT 
ATGGACCCTG 



ATAACCGGTG 
ACATTGCCAA 



TGTGGCTGCT GAGGCTGGGG TCAGCAAATA 
T GAGATATCTG GCAGAACCGA 



TCCAAAAGCC 
GCGCCTTCTC 
CCCCTTGTCA 
CCCCACCCCC 



ATTTGGCTGA 
CAGCCCACCC 
CTCAGCTAAT 
GAGCCCCAGG 
AGGCTTTGGG 



CCCAAGGTCA 
GGCCCCGGGA 
CTCTCCCATC 



CTGGCGTGGA 
TTTAGGCTCA 
CTGCAGTTCC 
TCTGCCACTC 
ATCCTCAGCA 
TTCCACAACC 
GCCTTCATCG 
CTAGCAGAAA 
CCACATTGCC 
TTCTTGACGC 



CTCCCATTTT 
CACAGGAGTG 
GGGTGGCCAA 
CCCTGGTCCC 
CTGCCCTGCC 
ACCAGATGGA 
GGAAAGCACG 
CCTCAGGCCT 



TTCACCTCAA 
CTGGTCCTTC 
ATCATGGTTT 
AAGGCTTGGC 
CACCCTGCCT 
ACATTGTCCT 
AAAGAGTCAA 



CCCCACCGCC TCTATCTGCA 



CAAGAACCTA 
TTTCCCTGCC 
GACACATGGA 
AATGGACCCC 
CGCGTGGCCA 
CAGGCAGAAG 
CCAAGCCACA 
ACAAAGAACA 
CCATCTAGTG 
GCGAGAGATT 
CAACTTCCCC 
TGGGAATCCT 
AAAAGAGGAG 
GGAAGCAGAA 
CAGAGGCAGC 
GATGTCTTTT 
CGATATCTTC 
CACCTTTGTG 



TTTGGCCCAC 
CGCCAAGCCC 
CGTCTGGCTG 
AAATGTAAGG 
GCGCTCAGCA 



ATGGAAGCAC 
CAGCAGCTTT 
TTCGAACAGC 
CATGTTACTA 
CCCAACACTA 
GAAAACTAGC 
GCCGCCATGC 
CCACTGTGGT 



ACCCAAATCA 
GGAAGACATC 
ACGCTCTGTG 
TTACGCCTAC 
CCTCTCCCGG 
TACAGAAGGC 
GGAGCGGGAA 
CAGCCAACTG 
GGTGGAGGAA 
TCTGCAGACT 
TTCTGTCTGC 
CCCTGTAGCC 
CCGGAGTGTC 
TGAGCCCGGC 



ACTGCCTATT 



TCTGGCAAGA 



CGTGGGTGTG 



GACCAGGAGC 



AAGGT CAACA 
AGAGGCTGTG 
AAAGGGAAGG 
CCACACTACC 
GTCCAATCAC 
CGGGGTCCAA 
GCTACGGGCT 



TGGAGATCTT 
GCTCCATCCT 
AAGGTTTAAG 
CAGGTCCCCT 



GAAATGTGGC 



TACCAGGAAA 



CCAATCACTA 
GGCTCCAATC 
TACGGGCTCC 
GGGCTACAGG 



AATGAGCACC 
CTCCCCTGTC 
CTTCAACAAC 
GGCTCGTGGC 
AATCAGAAGG 
GATGAAGAAA 
GCAAAGCCTG 
CCAAAGTGTG 
GCTACAGGGT 
AGGGGCTACG 
GAAAAGGGGC 



GCCACCATGC 
GQGAGAGAGA 
AGCTGCTTGA 
GGGAACTGGG 
GGGAAGATTC 
TTCCCCCCAG 
CCTGTCCTCC 
CTAGGGGGCC 
TGCGACAATG 
GACGCTGCGG 
TGCAGAGATT 
TGCCCGTGGC 
TTTACAAGGA 
CACCGGAGGA 
GCTCCCTGTG 
ATCCCAGAGA 
GACGACCCCC 

CATGGGAAGG 
CGCACGGTCA 
ATTGCCTACC 



CTGGGGCAGT 
GGCCTCTTGA 
TTCTGGACAT 
ATGGAGACCA 
GCTCCTCTGA 
AGGTGGGCCT 
GAGGCGTCAG 



GGGACTCGAG 



CCAGTGCTGA 
TTCCTCACTC 
ATGCACGTCT 
CCGGGTCACG 
CACTCAGCCT 



AGCCGGGACA 
GGCCAAGGCC 
TGAGCCCAGA 
GGAGCCTCTG 
GGGGCTTGCT 
CGTGGGCCCC 
TAACTTCCTT 



GGAGGGTTCC 



GCTATGGGGT 
AGAGAGCACG 



CCAATCACTA 
GGGTCCAATC 
GAGACCCCCT 
CAGGGGGTGT 
GAGAGTCCAT 
GGCTGCGAGT 
TCTACACCAA 
GTGAGATGGA 



CGGGCTCCAA 
GCTACAGGGT 
AGGGGCTACG 
GAAAAGGGGC 



GCCAAGGAGC 
CCCAGAACCC 
CCAATCACTA 
GGGTCCAATC 
TACGGGGTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 



TCCGCTTCCG 
TCACCGCCGA 
GCAGCTCCTC 
TGACAGCTGG 
CAGGGCCAGG 
GGAAGGACGA 
CGGCCTCTGC 
AAGCGCCACA 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
CAGGGTCCAA 



GGACATCGCC 
CTTTGACAAT 
AGCCAACCGC 



GGGTCAGGCC 
GGACTGGGCA 
TATTGCAGGG 
AGATACAAAG 
GGGCTACGGG 
AAAGGGGCTA 
AGGAAAAGGG 



AGGGGCTACG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 



CCAATCACTA 
GGGTCCAATC 
TACGGGGTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGAG 



ACT AC CACAG AAAGGGGCTA CGGGGTCCAA 
CATCGAGGAG CTCAATCCCG GCGACGCCTT 
GGTGACCGAC TTCGACGGAG ACGGGATGCT 
GGCTCAGCCG CTGTCCGTCT TCCGGGGCAA 
GGTGCCACGC ACCCGGTTTG GGGCCTTTGC 
GAAGAGTGGG GCCCACCTGA GGATCATCGA 
GCCCGTGGCA CACTTTGGCC TGGGGAAGGA 
AGATGGCAAG ATGGTGAGCC GGAACGTGGC 



CGTCATCCGT 
GGAGCCTGAG 
GGACCTCATC 
TCAGGGCTTC 
CAGGGGAGCT 



CCTGTGCCAC 
TCTTCAGGCT 
GTTCTATTCA 
CCAGGGTTCT 
TCTGATCCCC 

GCGAGGTGTC 



2700 
2760 
2820 

2940 

30S0 
3120 
3180 

3300 
3360 

3480 
3540 



3730 



CAGCGGGGAG 4320
ATGAACTCAG TGCTGGAGAT CCTCTACCCC CGGGATGAGG ACACACTTCA GGACCCAGCC
CCACTGGAGT GTGGCCAAGG ATTCTCCCAG CAGGAAAATG GCCATTGCAT GGACACCAAT
GAATGCATCC AGTTCCCATT CGTGTGCCCT CGAGACAAGC CCGTATGTGT CAACACCTAT
GGAAGCTACA GGTGCCGGAC CAACAAGAAG TGCAGTCGGG GCTACGAGCC CAACGAGGAT
GGCACAGCCT GCGTGGGTAC TGAGCTAGGC TCTAGGCATA CAATGACGTG GAAACCAAGG
CCCAAAAAGG AGCTGCAACT TTCCCAAGGC ATCTGCACCC CCGTCTGGTC CTTTTTCCTG
CCGGGTTGCC GGCTGCTCCT CAAAAGAGCT CAGCTCCAGG CTGCTCCCAG CACCCTTCTC
CAGAAAGCTC CAGGTATTCC AGAAGCCCAA GTGTATGAAC AAGATCAGGA ATAA 






3 IDGDGREEIY 
S SSLGQASPDS 
I LSDEVNVARG 
R GILALRDVAA 
GGDPEEADEE HSGDGSTSQL CRLGWKDGQF 



QHLPARELYD I 



PLNTNNAPSG 
RQGERVPVPC 
VASLFAGRSV 
EAGVSKYTEG 
KEEAAALVEE 
RQAPQHYPVA 
ALSTTWPGG 



VLKYDRAQKR L 
HSSSAQVP3G LHRNRPVLKP 
CRGGLRPTHE PEPPLLRPKS 



LAWNQMEKEE 
SATHCGSMSF 
AFIVHLKYHL 
FLTQGLASSA 
LSSERVNVGV 
SPKFSMPSPV 
GQGEGLRIRR 
KGKGNVAQSV 
RGPITTRKRG 
RKGLRAPITT 
NHYQEKGLQG 



GKIHGDHEPR 
CRDFPHSLCH 



SCLRPLEAGT 



RTVITADFDN 



PRTQAPQDTK 
YGVQSLPGKG 
RKRGYGVQSL 



ILS3SASDIF CDNENGPNFL 
LAETGPSSSC CPWHARLLQA 
QGAPPCLLAR 
ALADFNRDGK 
DQELEIFFNN IAYRSSSANR 
KVNTGPLMKK QKGRKDEDWA 
PHYHKKGLOG PITTRKRGYG 
ATGSNHYQEK GLQGPITTRK 
PGKGATGSNH YQEKGLRGPI 



:g eisgrteere 
qreagaagvp rgrvrtalqt 
plvtqlmthg rlagklarsv 
lrsweesrqk gqamsrcalr 
pkvtqechlv atmpalggle 
vpgaalpgnp gnwvldmaka 
pvlqvglgla 
daaasaerrl 

PHCHHGLSMS F 



PHRL.YLQMST 
LFRCSILARG 
RGCGNAGQSL 
VQSLPGKGAT 



AMGSNHYQEK GLRAPITTRK RGYGVQSLPQ KGATGSNVIR 
LSHGESMAQP LSVFRGNQGF 
GYLCEMEPVA HFGLGKDEAS 
PLECGQGFSQ QENGHCHDTN 
GTACVGTELG SRHTMTWKPR 
QKAPGIPEAQ VYEQDQE 



TTRKRGYGLQ 
RGPITTRKRG 
REHGDPLIEE 



AKEPASAIAG 
GSNHYQEKGL 
KGATGSNHYH 



KWLYTKKSG AHLRIIDGGS 
MNSVLEILYP RDEDTLQDPA 
GSYRCRTNKK CSRGYEPNED 
PGCRLLLKRA QLQAAPSTLL 



SVEVTWPDGK 
ECIQFPFVCP 
PKKSLQLSQG 



YGLQSLPGKE 
LNPGDALEPE 
TRFGAFARGA 
MVSRNVASGE 



Seq ID NO: 474 DNA sequence
Nucleic Acid Accession #: 
Coding sequence: 1..1152 



ATGAGTGCAC 
CAAAACGTTC 
GCTGCTGGCA 
AAGGAAAAAG 
GGATTCGTGG 



TTTTCCTTGG 
CAAGTGGGAC 
CCATGGACCC 
TGAGCACACA 
CTGCTGCTGA 



TACAGAAACT 
AGAAGGCTCC 
AATGTGGTGT 
CTGGCACCCT 



I 

~ TGTGGGAGTG 
AGATACTGGA 
AGAGAGCAGT 
GAATCTGCTA 
ACTGCCCAGG 
GATCATGAAA 



GATCCTCAAA 
CTCCTGCTGA 



AAGCTGGAGC 
GTAAGCCCCT 
AGGATGCCAT 
CTGATAATGA 
ATGAGCTCCG 



GAGGGTGCAA 60
CGGTGACTGG 120 



ACACAAGCCC 
GAGTTTTTGG 
ACACGAGGCA 
GTACCGCATG 



GTGCCCTTGC 
CTGGCTCTCT 
TCACAGAGGG 
CTTTGACCGG 
AAGCCCACGA
GTGAGAACAT 
TTGGGAAGGA 
CCTCAGCCTC 



AGATGGGGTT C 



AGGCAGCCTT G 



GTGAGCTTGA 
ACAAAGGCAC 
TGACCCTCGT 



3 ACTACGGAAA 



ATCCAACTTT 
CATCCGTGCC 
ACGCCCCCGG 



CTTTCCTTAG 
CTCAGACGAG 
GTCACTGAGC 



CGGCATGGGT 
GGAGTTGGGA 
GAAGTGGTGG 
GGAGGTGAGG 
TTACCAACTC 




MSALFLGVGV RAEEAGARVQ QNVPSGTDTG DPQSKPLGDW AAGTMDPE3S IFIEDAIKYF 

KEKVSTQNLL LLLTDNEAWN GFVAAAELPR NEADELRKAL DNLARQHIMK DKNWHDKGQQ 

YRNWFLKEFP RLKSELEDNI RRLRALADGV QKVHKGTTIA NWSGSLSIS SGILTLVGMG 

LAPFTEGGSL VLLEPGMELG ITAALTGITS STHDYGKKWN TQAQAHDLVI KSLDKLKEVR 

EFLGENISNF LSLAGNTYQL TRGIGKDIRA LRRARANLQS VPHASASRPR VTEPISAESG 

EQVERVNEPS ILEMSRGVKL TDVAPVSFFL VLDWYLVYE SKHLHEGAKS ETAEELKKVA 
QELEEKLNIL NNNYKILQAD QEL 

Seq ID NO: 476 DNA sequence 

Nucleic Acid Accession #: NM_014452.1 

Coding sequence: 1..1968
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ATGGGGACCT CTCCGAGCAG 
GCCACAGCCA CGATGATCGC 
GCTCAGCCAG AACAGAAGGC 
ACCGGCCAGG TGCTAACCTG 
ACCAACACAA GCCTGCGCGT 
AATGGCATAG AGAAATGCCA 
TTACCTTGTG CTGCCTTGAC 
AACGCTACCT GTGCCCCCCA 
ACAGAGACTG AGGATGTGCG 
TCTAGTGTGA TGAAATGCAA 
AAGCCGGGGA CCAAGGAGAC 
ACCTCACCTT CCCCTGGCAC 
GTCCCTTCCT 
TCTGTTAGAC 
T CAGCAAGGG GGAAGGAAGA 
CAGCAAGGCC CCCACCACAG 



•C CTCGCCTCCT GCA . 3 
GGGCTCCCTT CTCCTGCTTG GATTCCTTAG 
CTCGAATCTC ATTGGCACAT ACCGCCATGT 



I 






CTGCAGCAGT 
TGACTGTAGT 
TGACCGAGAA 
TACGGTGTGT 
GTGTAAGCAG 



AGCCATCTTT 



CAGCCATGCC 
TGCACTTGCC 
CCTGTGGGTT 
TGTGCTCGGG 
GACTGTCTGA 
TGTGGCACAC 
CCACGCCCTG 
ATGAACTCAA 



GGACCTTTAC 
CATGGCCAAT 
CACCTGGCAT 



CACCACCACA 
TGACCGTGCC 

tgagcattgt 

CAGGCATGAG 
GATTGAGAAA '< 
GTTCCAGTCT 
GAAGAAAGGG 
AGATGTGCCT 



CTCCAGCTCC 
AACCCATGAA 
CTCTTCTGCC 



3 ACCCTCCCAA A. 



AGTCAACCAC 
GGCCACTGGG 
ACAGAACCTA 
CCTGCTGCTG 
GAAAAAGGGG 



ACCCAGAACC 
CTTGTAGCAG 
AGTGAGAGGG 
GCAGCTCTGC 
GCCCTGCGCC 



GGGAGAAATG G 



AGCCCCATCC 
TCCCCACAGG 
GACTCTACAT 
AAGAAGGACA 
GATGACATGC 
GCTGAGGACA 
CAGACCCTCC 



CCAGCCCCAA 
CCAGCGGCTC 



AAAGATATCT 
GGGTACACAG 
CCCGAGGCCA 

CTCCCGATGA 



iTATCGA 
CCGACCACGA 
TTCGTGGGCT 



TATCCTGAAG 
GCGGGCCTAC 



TCCACTTTCT 
AACTAGACCG 
TGGACTCTGT 



GGGCTTCTTC 
CTCCGCGCTG 
GCAGGTACGC 
AAATCCTGAG 
GCTATTCGAA 
TTATAGCCAT 



GTGGATGAGT 
AGCAGGAACG 
CTGGACCCCT 



CTCTCCTGAC 
CGGAGCCCCT 
GTTCCTTTAT 



GATGGAAGAC 
GCTTAGCCCG 
GGTGGAGCCT 
TCTCCGCTGT 
TACCAAAGAA 
GCCTATCTTT 



A GGAAGCCAGC 



Seq ID NO: 477 Protein sequence



TGQVLTCDKC 
LPCAALTDRE 
SSVMKCKAYT 
VPSSTYVPKG 
QQGPIIIIRHIL 
VLVVTVVCSI 
LVAAQVGSQW 
ALRQIIRRNDV 
SPQDKNKGFF 



I 

LAS 

PAGTYVSEHC 
CTCPPGMFQS 
DCLSQNLVVI 
MNSTESNSSA 
KLLPSMEATG 
RKSSRTLKKG 
KDIYQFLOJA 



I 

ATATMIAGSL 
TNTSLRVCSS 
NATCAPIITVC 
KPGTKETDNV 
SVRPKVLSSI 
GEKSSTPIKG 
PRQDPSAIVE 
SEREVAAFSN 



LLLGFLSTTT AQPEQKASNL 
CPVGTFTRHE NGIEKCHDCS 
PVGWGVRKKG TETEDVRCKQ 
CGTLPSFSSS TSPSPGTAIF 



IGTYRHVDRA 
QPCPWPKIEK 
CARGTFSDVP 



Q AEDKLDRLFE 



PKRGIIPRQNL IIKHFDINSIIL 
TQNREKHIYY 
AALQHWTIRG 
LPMSPSPLS? SPIPEPNAKL 
SRNGSFITKE KKDTVLRQVR 
IIGVKSQEAS QTLLDSVYSH



TLPNLQWNH 
PWMIVLFLLL 
CNGHGIDILK 
PEASLAQLIS 
ENSALLTVEP 
LDPCDLQPIF 



Seq ID NO: 478 DNA sequence 
Nucleic Acid 
Coding sequence 




GGAGGCGGGG 
AGTCCGGCCG 
CTGCGCACCG 



GCCCCCGGGG 

GCCCGAGCCG 

CGATGGGCCT 

TGCTGCTGCT CCTGCTGCTG 
CAGCCTGCCT 



GGCGGGGAGT 
TTCAAGGGCA 
AGCGGCAGTC 



GTGCTCGAGA 
ACCAGGAGCT 
AGGACCCACA 
ACCTGTTCAC 
ACTTCACCCT 
GTCCCTTCGA 
GAACAGTCAG 
CCACCAAGAC 
ACATTCCTGA 
GCGAGACTGG 
TCTGCAAGGG 
AGGCCCAGCT 
CTGCAGGATG TCTTCACGCT 



GCGCGACTGT C 



GCAGACGCAG AGAAGAAACA 



AAGGGCCGTT 
CTCTACACTG 
AGCCTTCGCC 
GCCTCAGCCT 



GGCAAGGGAC 
CCCGAATTTC 
CAGCTTCCAA 



GAGCCTGGGC 
CCAGGAATTT 
CGATGAGGGT 
GCTGTGCTCA 
GAGCCCCAGC 



GGGAATGACC 
CTCAACTGGC 
AGCTTGCAAG 
GAGTTCTTTG 



CGGCCCGACG 



3 AGTCTTCAGC 



ACTACAGAAG 
GGCCTCTACA 
CCCACACCCC 



ACCACTTCCT GATGGACGGG 
GCTACCAGCG CGTGGCTGTA 
TCCTGGGCAC TGGTGACGGC 
TCATTGAGGA GCTGCAGATC 



ATGTCCTCCT 
CCCTGGTGGT 
CGGCCATCTC 
TGCAAGACCC 
GCGATGATGA 
AGAACACCAT 
TGCTACAGCA 
ATGGCTTCCC 
GGCGTGACAC 
GCTCTGCCGT 
AGGAGGTGAA 
GGCCTGGAGC 
TCCCAGACCG 
GCCGCATGCT 
CTGGCCTGCA 
AGGCAGTGAG 



GCTGCCGCCT 
GCCTCCGACC 
ATTCCTCAGA 
TGGCAGGACC 
CTTCCTGCCA 
GCAGTGCAGC 
CCTGCCGCTC 
TACCTACATC 



GCGGAGCCAA 
AGCTTTTGTG 
CAAGATCTAC 
TGTGTCCCGC 
GCGCTGGACC 
CTTCAACGTG 
CCTTTTCTAT 
CTGTGTCTTC 



GTGCATCACC 
CGTGCTGAAC 
GCTGCTGCAG 



CAGGTCCGAA
CACCGCGTCC 
CGGCTCCACA 
TTCTCATCGG GACAGCCCGT GCAGAATCTG 
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CTCCTGGACA CCCACAGGGG GCTGCTGTAT GCGGCCTCAC ACTCGGGCGT AGTCOAGGTG 180 0 

CCCATGGCCA ACTGCAGCCT GTACAGGAGC TGTGGGGACT GCCTCCTCGC CCGGGACCCC 1860

TACTGTGCTT GGAGCGGCTC CAGCTGCAAG CACGTCAGCC TCTACCAGCC TCAGCTGGCC 1920

ACCAGGCCGT GGATCCAGGA CATCGAGGGA GCCAGCGCCA AGGACCTTTG CAGCGCGTCT 1980

TCGGTTGTGT CCCCGTCTTT TGTACCAACA GGGGAGAAGC CATGTGAGCA AGTCCAGTTC 2040

CAGCCCAACA CAGTGAACAC TTTGGCCTGC CCGCTCCTCT CCAAl 3GC GJ ^CGACTC 2100 

TGGCTACGCA ACGGGGCCCC CGTCAATGCC TCGGCCTCCT GCCACGTGCT ACCCACTGGG 2160 

GACCTGCTGC TGGTGGGCAC CCAACAGCTG GGGGAGTTCC AGTGCTGGTC ACTAGAGGAG 2220

GGCTTCCAGC AGCTGGTAGC CAGCTACTGC CCAGAGGTGG TGGAGGACGG GGTGGCAGAC 2280

CAAACAGATG AGGGTGGCAG TGTACCCGTC ATTATCAGCA CATCGCGTGT GAGTGCACCA 2340

GCTGGTGGCA AGGCCAGCTG GGGTGCAGAC AGGTCCTACT GGAAGGAGTT CCTGGTGATG 2400 

TGCACGCTCT TTGTGCTGGC CGTGCTGCTC CCAGTTTTAT TCTTGCTCTA CCGGCACCGG 2460 

AACAGCATGA AAGTCTTCCT GAAGCAGGGG GAATGTGCCA GCGTGCACCC CAAGACCTGC 2520

CCTGTGGTGC TGCCCCCTGA GACCCGCCCA CTCAACGGCC TAGGGCCCCC TAGCACCCCG 2580

CTCGATCACC GAGGGTACCA GTCCCTGTCA GACAGCCCCC CGGGGTCCCG AGTCTTCACT 2640

GAGTCAGAGA AGAGGCCACT CAGCATCCAA GACAGCTTCG TGGAGGTATC CCCAGTGTGC 2700

CCCCGGCCCC GGGTCCGCCT TGGCTCGGAG ATCCGTGACT CTGTGGTGTG AGAGCTGACT 2760 

TCCAGAGGAC GCTGCCCTGG CTTCAGGGGC TGTGAATGCT CGGAGAGGGT CAACTGGACC 2820 

TCCCCICCGC TCTGCTCTTC GTGGAACACG ACCGTGGTGC CCGGCCCTTG GGAGCCTTGG 2880 

GGCCAGCTGG CCTGCTGCTC TCCAGTCAAG TAGCGAAGCT CCTACCACCC AGACACCCAA 2940

ACAGCCGTGG CCCCAGAGGT CCTGGCCAAA TATGGGGGCC TGCCTAGGTT GGTGGAACAG 3000 

TGCTCCTTAT GTAAACTGAG CCCTTTGTTT AAAAAACAAT TCCAAATGTG AAACTAGAAT 3060 

GAGAGGGAAG AGATAGCATG GCATGCAGCA CACACGGCTG CTCCAGTTCA TGGCCTCCCA 3120

GGGGTGCTGG GGATGCATCC AAAGTGGTTG TCTGAGACAG AGTTGGAAAC CCTCACCAAC 3180 

TGGCCTCTTC ACCTTCCACA TTATCCCGCT GCCACCGGCT GCCCTGTCTC ACTGCAGATT 3240

CAGGACCAGC TTGGGCTGCG TGCGTTCTGC CTTGCCAGTC AGCCGAGGAT GTAGTTGTTG 3300 

CTGCCGTCGT CCCACCACCT CAGGGACCAG AGGGCTAGGT TGGCACTGCG GCCCTCACCA 3360 

GGTCCTGGGC TCGGACCCAA CTCCTGGACC TTTCCAGCCT GTATCAGGCT GTGGCCACAC 3420 

GAGAGGACAG CGCGAGCTCA GGAGAGATTT CGTGACAATG TACSCCTTTC CCTCAGAATT 3480

CAGGGAAGAG ACTGTCGCCT GCCTTCCTCC GTTGTTGCGT GAGAACCCGT GTGCCCCTTC 3540

CCACCATATC CACCCTCGCT CCATCTTTGA ACTCAAACAC GAGGAACTAA CTGCACCCTG 3600 

GTCCTCTCCC CAGTCCCCAG TTCACCCTCC ATCCCTCACC TTCCTCCACT CTAAGGGATA 3660 

TCAACACTGC CCAGCACAGG GGCCCTGAAT TTATGTGGTT TTTATACATT TTTTAATAAG 3720
ATGCACTTTA TGTCATTTTT TAATAAAGTC TGAAGAATTA CTGTTT 



Seq ID NO: 479 Protein sequence



MLRTAMGIiRS WLAAPWGALP PRPPLLLLLL LLLLLQPPPP TWALSPRISL P 

RFEAEHISNY TALLLSRDGR TLYVGAREAL FALSSNLSFL PGGEYQELLW GADAEKKQQC 120 

SFKGKDPQRD CQNYIKILLP LSGSHLFTCG TAAFSPMCTY INMENFTLAR DEKGNVLLED 180 

GKGRCPFDPN FKSTALWDG ELYTGTVSSF QGNDPAISRS QSLRPTKTES SLNWLQDPAF 240 

VASAY1PESL GSLQGDDDKI YFFFSETGQE FEFFENTIV3 RIARICKGDE GGERVLQQRW 300 

TSFLKAQIiLC SRPDDGFPFN VLQDVFTLSP SPQDWRDTLF YGVFTSQWHR GTT EG 3 AVCV 3 50 

FTMKDVQRVF SGLYKEVNRE TQQWYTVTHP VPTPRPGACI TNSARERKIN SSLQLPDRVL 420 

NFLKDHFLMD GQVRSRMLLL QPQARYQRVA VHRVPGLHHT YDVLFLGTGD GRLHKAVSVG 480 

PRVHIIEELQ IFSSGQPVQN LLLDTHRGLL YAASHSGVVQ VPMANCSLYR SCGDCLLARD 540

PYCAWSGSSC KHVSLYQPQL ATRPWIQDIE GASAKDLCSA SSVVSPSFVP TGEKPCEQVQ 600

FQPNTVNTLA CPLLSNLATR LWLRNGAPVN ASASCHVLPT GDLLLVGTQQ LGEFQCWSLE 660

EGFQQLVASY CPEVVEDGVA DQTDEGGSVP VIISTSRVSA PAGGKASWGA DRSYWKEFLV 720

MCTLFVLAVL LPVLFLLYRH RNSMKVFLKQ GECASVHPKT CPVVLPPETR PLNGLGPPST 780
PLDHRGYQSL SDSPPGSRVF TESEKRPLSI QDSFVEVSPV CPRPRVRLGS EIRDSVV

Seq ID NO: 480 DNA sequence 

Nucleic Acid Accession #: NM_004217.1 

Coding sequence: 58..1092

1 11 21 31 41 51 

GGCCGGGAGA GTAGCAGTGC CTTGGACCCC AGCTCTCCTC CCCCTTTCTC TCTAAGGATG 60 

GCCCAGAAGG AGAACTCCTA CCCCTGGCCC TACGGCCGAC AGACGGCTCC ATCTGGCCTG 120 

AGCACCCTGC CCCAGCGAGT CCTCCGGAAA GAGCCTGTCA CCCCATCTGC ACTTGTCCTC 180 

ATGAGCCGCT CCAATGTCCA GCCCACAGCT GCCCCTGGCC AGAAGGTGAT GGAGAATAGC 240 

AGTGGGACAC CCGACATCTT AACGCGGCAC TTCACAATTG ATGACTTTGA GATTGGGCGT 300 

CCTCTGGGCA AAGGCAAGTT TGGAAACGTG TACTTGGCTC GGGAGAAGAA AAGCCATTTC 360 

ATCGTGGCGC TCAAGGTCCT CTTCAAGTCC CAGATAGAGA AGGAGGGCGT GGAGCATCAG 420 

CTGCGCAGAG AGATCGAAAT CCAGGCCCAC CTGCACCATC CCAACATCCT GCGTCTCTAC 480 

AACTATTTTT ATGACCGGAG GAGGATCTAC TTGATTCTAG AGTATGCCCC CCGCGGGGAG 540 

CTCTACAAGG AGCTGCAGAA GAGCTGCACA TTTGACGAGC AGCGAACAGC CACGATCATG 600

GAGGAGTTGG CAGATGCTCT AATGTACTGC CATGGGAAGA AGGTGATTCA CAGAGACATA 660 

AAGCCAGAAA ATCTGCTCTT AGGGCTCAAG GGAGAGCTGA AGATTGCTGA CTTCGGCTGG 720

TCTGTGCATG CGCCCTCCCT GAGGAGGAAG ACAATGTGTG GCACCCTGGA CTACCTGCCC 780

CCAGAGATGA TTGAGGGGCG CATGCACAAT GAGAAGGTGG ATCTGTGGTG CATTGGAGTG 840

CTTTGCTATG AGCTGCTGGT GGGGAACCCA CCCTTTGAGA GTGCATCACA CAACGAGACC 900 

TATCGCCGCA TCGTCAAGGT GGACCTAAAG TTCCCCGCTT CTGTGCCCAC GGGAGCCCAG 960 

GACCTCATCT CCAAACTGCT CAGGCATAAC CCCTCGGAAC GGCTGCCCCT GGCCCAGGTC 1020

TCAGCCCACC CTTGGGTCCG GGCCAACTCT CGGAGGGTGC TGCCTCCCTC TGCCCTTCAA 1080

TCTGTCGCCT GATGGTCCCT GTCATTCACT CGGGTGCGTG TGTTTGTATG TCTGTG7ATG 1140 

TATAGGGGAA AGAAGGGATC CCTAACTGTT CCCTTATCTG TTTTCTACCT CCTCCTTTGT 1200
TTAATAAAGG CTGAAGCTTT TTGT 

Seq ID NO: 481 Protein sequence 
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MAQKENSYPW P 



I 



3 LSTLPQRVLR KEPVTPSALV LMSRSNVQPT AAPGQKVMEN 
3KGKFGN VYLAREKKSH FIVALKVLFK SQIEKEGVEH 
QLHREIEIQA HLHHPNILRL YNYFYDRRRI YLILEYAPRG ELYKELQKSC TFDEQRTATI
MEELADALMY CHGKKVIHRD IKPENLLLGL KGELKIADFG WSVHAPSLRR KTMCGTLDYL
PPEMIEGRMH NEKVDLWCIG VLCYELLVGN PPFESASHNE TYRRIVKVDL KFPASVPTGA 
QDLISKLLRH NPSERLPLAQ VSAHPWVRAN SRRVLPPSAL QSVA 

Seq ID NO: 482 DNA sequence
Nucleic Acid Accession #: AK055SS3
Coding sequence: 38..1423






CCGAAGGTCC 
GCTTATGTGG 
TTTTGATCTT 
TAGCCCTGTC 
AGTCTTGGCA 
ACAGCCCGAG 
CCTGTTCACG 
TACGAGCTGG 



AGATCCTTTT 
TGGAAGATAC 
TGCAGTTCTA 
TTTAGTTTAA 
TATTCATTTG 
CAGTTGGGAG 
ATACACACGG 
ATGCTTTCTA
CTTCAAGAGC 



CTGTGCAGCT 
TTGGCAAGTT 
TGCTCTTTGG 
CTAATAGTAT 



G GGGACAATTC ATCTCTTTCG 



AGCTTTAACT G 



ACTCATCAGA 



CTTTGTATTA 
GCTATAGCTA 
AAAGTCTTAC 
GAGGTATCTA 



TGAACAAATG 
TGTTCAAATT 
CAATGTCCTA 



ATTTAACACT 



GTTCTTGCTC 
TTCAAGGATG 
AACTTTTCAG 
CCAGTTACAT 
CCTGGGAAAA 
GGTCTCAATC 



TGGAGTTCCA 
TAGATATGGA 
TTATAAGGAA 
TGTTTAATCA 
TATGAAACTA 
GCTTTAAATA 
GITTTGTAGT TGACTGCAGT G 



GGTTTGAAAG 
CTCTCTTTAT 
GAAGATTATT 
TTCGGAATAA 
ATGTTGCAGA 
TTCCCCGAAT 
CATATATGCT 
TTGCCTTGAT 
TCCAGACAAC 
CCTTAGATGG 
TGGCTGGATC 
ATGIGACCAA 
ACTGGATTAG 
ATCATCACGT 
CAACTCCAGC 
ATGTGAACCC 
ATGGACACAC 
CAACTCAAGG 



ATTAGAAGTC 
ATTAAAAGAA 
AGTTGGTACT 
ACCTTTTGCT 
TCTTAGTCGA 
GAATCCATTT 
CATTGAAATT 
GACATTTGGC 
ACCACCCCAT 
AGTTTTAGAA 
AGTGCATGTA 
CAGGCIGTAC 
GCCTGCCTTA 
AATCCCAATG 
TAAACCTAGT 



CTGGCTGTAT 
AGTGCAGAAC 
TTTGTGGCTC 
TATGTCTCAG 
AGCTTGTGTG 
jTTTTGATTG 



CTGGCTTCCT 
ACCTGACCAT 
TGAGGAAACC 
TTGCCTCCAC 
GCTTTTTGGA 
TTTGTTTCAA 



ACTATGTATC 
GTTATTGGTC 
GTCCGAAATG 
AGAATTCGAC 
ACTCTAGTGT 



CCATGAGTGT 



A ATTTATTTAG 



TGTTCACATT 
TTTGGATTTT 



CTGTTGCAGC 
AGGGTACTGA 
CAGAATTTTC 
AAACAAGGCC 
ATCAAGGACT 
ATATACCAAG 
TAACTTATTT 
TTGCATTGAC 
TCATGAAACC 
AATGTTAAAG 



TTGATGGAGT 



T GTTTTTTGAG 



.920 

TGCCTCAGCC TCCCGAGTAG CTGGGATTAC AGGCACCTGC CACCACGCCC AGCTAATTTT 2040

! GGGATTTCAC CATGTTGGCC AGGCTGGTCT TGAACTCCTG 2100 

' TAGCCTCCCA AAGTGCTGGG ATTAGGTGTG AGCCACCGCA 2160

i ATGAAATTTA TAAATATGCT TCTTGAATAA TACACATTTT 2220 

GGGAAAGGGA AAAATGTCTG TTCAAAAAGT AAAGGTCTCT TTTATAGCTT TTCCAAACTT 2280 

AATTGCTAAA TTTTTCTTTG AGGTTCTCCT GAATTATGTC TTACAAACTA AAAGCAAAAA 2340

TTTTTAGCAG AAATTTTGGA ATACATTCTA TCTAGCACAA TTTGAATTTT TAATTATCAA 2400 
GATTTTTGTT AAAGTTTCTC TCCTTTAAAA ATTTTAGTAC ATTTGTAAAT 



Seq I 



I 



I 



I 



I 



MGTIHLFRKP QRSFFGKLLR EFRLVAADRR SWKILLFGVI NLICTGFLLM WCSSTNSIAL 

TAYTYLTIFD LFSLMTCLIS YWVTLRKPSP VYSFGFERLE VLAVFASTVL AQLGALFILK 

ESAERPLEQP EIHTGRLLVG TFVALCFNLF TMLSIRNKPF AYVSEAASTS WLQEKVADLS 

RSLCGIIPGL SSIFLPRMNP FVLIDLAGAF ALCITYMLIE INNYFAVDTA SAIAIALMTF 

GTMYPMSVYS GKVLLQTTPP HVIGQLDKLI REVSTLDGVL EVRNEHFWTL GFGSLAGSVH 

VRIRRDANEQ MVLAHVTNRL YTLVSTLTVQ IFKDDWIRPA LLSGPVAANV LNFSDHKVIP 

MPLLKGTDDL NPVTSTPAKP SSPPPEFSFN TPGKNVNPVI LLNTQTRPYG FGLNHGHTPY 
SSMLNQGLGV PC 

Seq ID NO: 4( 
Nucleic Acid 
Coding sequenci 



C GGGAGCTGAG 



AGAGCAGCCT 
CGCTGGACAC 
GGGCTGTGCA 
GAGGAGACTG 
CAGGCTCTCC 
TTGAGCTCTG 
CGGATACCGA 
TCACAGAGAA 
TGGGCACCCA 



CGAGGCCGAG 
CCCAGAGCTG 
CATCGTCAGC 
CTTCTCTGGT 
CCGGGGAGCT 
GAGCAGGCCC 

GGACACAGCG 
TGTCTTCCTG 
ATGGCTGCCC 
GGCCGACCTG 



ACGTACGTTC 



GACCCCTCCC 
GCGTGCTGGT GGGCGACGGC 
ATGGGTACCC CGCGCGCTAC 



GCGTGCTTCA 
GAGATCCGCA 
AGGGACGATG 



TCTCGGCGGG AGGGCGCAGA 
CTGGTGCGGC CCAGGACGCT 
AAGTCCTGGT GGATGGAGCT 
ATTTTGACCG ACTTCGTTCC 
GCGTGGTGCA GCCCAGCTCC 
CGCACAACCC CCAGGCGCCT 
AATTCAGCTG 
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GACCAGGGGG GCCGGGAGGG CCCCGTGCCC CAACCCCAGG CTCAGGGTCT GGCCGAGAAG 
3CTACCT TGAGTGCTCA GCCTTGACGC AGAAGAACTT GAAGGAAGTA 
3 CTATTCTCAG TGCCATTGAG CACAAAGCCC GGCTGGAGAA GAAACTGAAT 
G TGCGCACCCT CTCCCGCTGC CGCTGGAAGA AGTTCTTCTG CTTCGTTTGA 

Protein sequence 






? PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCNGYPARY 
G TYVQSPVRPR GCGGAVHRGA GAGVSAGGRR GPRGGDWSRP RGGAGAAQDA 
\ PAVQVLVDGA PVRIELWDTA GQEDPDRLRS LCYPDTDVFL ACFSWQPSS 
FQNITEKWLP EIRTHNPQAP VLLVGTQADL RDDVNVLIQL DQGGREGPVP QPQAQGLAEK 



Seq ID NO: 486 DNA sequence



CGGCCCACTG CGCTGGACAC 
ATTGAGCTCT GGGACACAGC 
CCGGATACCG ATGTCTTCCT 
ATCACAGAGA AATGGCTGCC 
GTGGGCACCC AGGCCGACCT 
GGCCGGGAGG GCCCCGTGCC 
TGCTGCTACC TTGAGTGCTC 
GCTATTCTCA GTGCCATTGA 
GTGCGCACCC TCTCCCGCTG 



CTTCTCTGTG 



GGCGTGCTTC 
CGAGATCCGC 
GAGGGACGAT 
CCAACCCCAG 
AGCCTTGACG 
GCACAAAGCC 
CCGCTGGAAG 

a sequence 



CCGCCCCCGC TCCGGGCCCC GACCCCTCCC 
GGCATCAAGT GCGTGCTGGT GGGCGACGGC 
TACACCTGCA ATGGGTACCC CGCGCGCTAC 
CAAGTCCTGG TGGATGGAGC TCCGGTGCGC 
GACTTCGTTC C 



AGCGTGGTGC AGCCCAGCTC CTTICAAAAC 
ACGCACAACC CCCAGGCGCC TGTGCTGCTG 
GTCAACGTAC TAATTCAGCT GGACCAGGGG 
GCTCAGGGTC TGGCCGAGAA GATCCGAGCC 
CAGAAGAACT TGAAGGAAGT ATTTGACTCG 
CGGCTGGAGA AGAAACTGAA TGCCAAAGGT 
AAGTTCTTCT GCTTCGTTTG A 



MPPRELSEAE PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCNGYPARY 
RPTALDTFSV QVLVDGAPVR IELWDTAGQE DFDRLRSLCY PDTDVFLACF SWQPSSFQN 
ITEKWLPEIR THNPQAPVLL VGTQADLRDD VNVLIQLDOG GREGPVPQPQ AQGLAEKIRA 
T QKNLKEVFDS AILSAIEHKA RLEKKLNAKG VRTLSRCRWK KFFCFV 



GGCACCGATT CGGGGCCTGC CCGGACTTCG CCGCACGCTG CAGAACCTCG C 



CACGATGGCA 
ACTGCAGCAG 
CCTCACCAAA 
ACAGTAAAAA 
ATTACCTACA 
GTTACTGAAG 
CCACCAGCTC 
ACTCAACCCA 
ACAACCGGTC 
AATACCACCC 
CCATCGTCAG 



CAACAGTACA 



TACTTCAACA 
AACCTTCTGT 
TCATATTATA 
CAAGGAATCA 



CCCTGGTCAC 
TTACAGTCGG 
ATACAGCTGG 
GTAACCAGAC 
AGAAGCCTGA 
GCACAGCTGC 
TCAAGACTGG 
GGATACAGCT 
TCGACCCCAA 
TGAATTTTCA 
TCAGTGAAGT 
AACATGCGGT 
AGAGCCTCCA 
TTGATTTTGA 



AGCAAAAGCA 
GGACATAAAA AAACCTGTCC AGCAACCAGC 
AAGATTCATG GATGGTCATA 
ACTACAAAAA 
ACACCCAACA 
GCCCCTTATT 



AACCCAGGCC 



ACTCACACAC 



AACCAGTTCA 
CACCCTTCCA 
TCAACCCACC 
ACCTGCCTCC 
AATTTATCAG 
GATTGTTCAA 
CGCAACGCAA 
GGGCGGATTT 
GGGAGCCTAT 
GGTGATGTTC 
GTTGTCAGCC 
AGATGACCAC 
GATTGGGGCC 



GCAACTTTAT 
CAIGCCCCAG 
ACGGTTCCTG 
GTTCTAAACG 
GACAAGGAGT 



GCCACACAAC 
CGATAGCACT 
GAACAACGGC 
GGCCCACCCT 
GAAGCAGACT 



TAAGCAAGCA 
AACAGCGGCC 
CACCAGCCCA 
AGCTCCTCCA 
CACCATCACC 



AGCTGCCCAC 660 



ACTGTGGCAC 
CATTTACCAA 
CAGATCCAGA 



GTGAATCTCA CATTTACCAA 



AATGAAGTGA 

GCCACTCAAA 
AGCCTTCAAA 



ATGAAAATAA 
GTCATGTGTG 



TGGAATTTAG 
GTGGGTCCTT 
ATTTAAGTTC 
AGTGAGCTGT 



CACCTGCAGG 
TTTGGAAATG 
ATCGTGGTTG 
TCATCTGGAT 
AGAACTCTTT 



ACCTCGGAGA 
CCGAAAATCC 
GGATGAAGAA 
GACAGTTTAC
CTTCAAGTGC 



ACCAGAGAAT 
CATCCCTTCC 
AAACCACCAT 



CTCGTCTGAC 
TATGGGTATG 
CTAATTGTTG 



CTTCTATTCA 



TTATAAACCA 



TTAACAAAGC 



CAGGCTGGAG 



TACAGTGGCA 
GCTTCAGCTT 

CAGGTGATCC 



TTGTTTTTGT 
CGATCTCGGC 
CCCGAGTAGC 



ACCCACCTCA 



GTAACTAATA 

GTAACATACA 
TTTTTGAGAC 
TTATGGCAAC 
TGGGATTACA 
TTTCACCATG 
GCCTCCCAAA 
TTAATCATCA 



AATATATGTA 
CTACTGTGTG 
TTATCAAATG 
TATTCCTGGT 



AAGTAGAATA 
GACTTTCAGT 



TTGGCCAGAC TGGTCTTGAA 
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GTTGTCTAAG TGTTTTTATG TAAAACCAAC AAAAAGAACA AATCAGCTTA TATTTTTTA? 2220 

CTTGATGACT CCTGCTCCAG AATTGCTAGA CTAAGAATTA GGTGGCTACA GATGGTAGAA 2280

CTAAACAATA AGCAAGAGAC AATAATAATG GCCCTTAATT ATTAACAAAG TGCCAGAGTC 2340 

TAGGCTAAGC ACTTTATCTA TATCTCATTT CATTCTCACA ACTTATAAGT GAATGAGTAA 2400 

ACTGAGACTT AAGGGAACTG AATCACTTAA ATGTCACCTG GCTAACTGAT GGCAGAGCCA 2460 

GAGCTTGAAT TCATGTTGGT CTGACATCAA GGTCTTTGGT CTTCTCCCTA CACCAAGTTA 2520

CCTACAAGAA CAATGACACC ACACTCTGCC TGAAGGCTCA CACCTCA7AC CAGCATACGC 2580

TCACCTTACA GGGAAATGGG TTTATCCAGG ATCATGAGAC ATTAGGGTAG ATGAAAGGAG 2640

AGCTTTGCAG ATAACAAAAT AGCCTATCCT TAATAAATCC TCCACTCTCT GGAAGGAGAC 2700 

TCAGGGGCTT TGTAAAACAT TAGTCAGTTG CTCATTTTTA TGGGATTGCT TAGCTGGGCT 2760 

GTAAAGATGA AGGCATCAAA TAAACTCAAA GTATTTTTAA ATTTTTTTGA TAATAGAGAA 2820

ACTTCGCTAA CCAACTGTTC TTTCTTGAGT GTATAGCCCC ATCTTGTGGT AACTTGCTGC 2880

TTCTGCACTT CATATCCATA TTTCCTATTG TTCACTTTAT TCTGTAGAGC AGCCTGCCAA 2940

GAATTTTATT TCTGCTGTTT TTTTTGCTGC TAAAGAAAGG AACTAAGTCA GGATGTTAAC 3000 

AGAAAAGTCC ACATAACCCT AGAATTCTTA GTCAAGGAAT AATTCAAGTC AGCCTAGAGA 3060

CCATGTTGAC TTTCCTCATG TGTTTCCTTA TGACTCAGTA AGTTGGCAAG GTCCTGACTT 3120
TAGTCTTAAT AAAACATTGA ATTGTAGTAA AGGTTTTTGC AATAAAAACT TACTTTGG 

Seq ID NO: 489 Protein sequence 
Protein Accession #: NP_055213.1

1 11 21 31 41 51 

| | | | | |

MPRQLSAAAA LFASLAVILH DGSOMRAKAF PETRDYSQPT AAATVQDIKK PVQQPAKQAP 60 

HQTLAARPMD GHITFQTAAT VKIPTTTPAT TKNTATTSPI TYTLVTTQAT PNNSHTAPPV 120

TEVTVGPSLA PYSLPPTITP PAHTAGTSSS TVSHTTGNTT QPSNQTTLPA TLSIALHKST 180 

TGQKPDQPTH APGTTAAAHK TTRTAAPAST VPGPTLAPQP SSVKTGIYQV LNGSRLCIKA 240 

EMGIQLIVQD KESVFSPRRY FNIDPNATQA SGNCGTRKSN LLLNFQGGFV NLTFTKDEES 300 

YYISEVGAYL TVSDPETVYQ GIKHAWMPQ TAVGHSFKCV SEQSLQLSAH LQVKTTDVQL 360 
QAFDFEDDHF GNVDECSSDY TIVLPVIGAI WGLCLMGMG VYKIRLRCQS SGYQRI 

Seq ID NO: 490 DNA sequence

Nucleic Acid Accession #: NM_005409.3

Coding sequence: 94..378



1 11 21 31 41 51 

| | | | | |

TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAGAAGAG CAGCAAAGCT GAftGTAGCAG 60 

CAACAGCACC AGCAGCAACA GCAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC 120 

TTOGCTGTGA TATTGTGTGC TACAGTTGTT CAAGGCTTCC CCATGTTCAA AAGAGGACGC 180

TGTCTTTGCA TAGGCCCTGG GGTAAAAGCA GTGAAAGTGG CAGATATTGA GAAAGCCTCC 240

ATAATCTACC CAAGTAACAA CTGTGACAAA ATAGAAGTGA TTATTACCCT GAAAGAAAAT 300 

AAAGGACAAC GATGCCTAAA TCCCAAATCG AAGCAAGCAA GGCTTATAAT CAAAAAAGTT 360

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGGAAAAGC GCATCTGAAA 420

AACCTAGAAC AAGTTTAACT GTGACTACTG AAATGACAAG AATTCTACAG TAGGAAACTG 480

AGACTTTTCT ATGGTTTTGT GACTTTCAAC TTTTGTACAG TTATGTGAAG GATGAAAGGT 540 

GGGTGAAAGG ACCAAAAACA GAAATACAGT CTTCCTGAAT GAATGACAAT CAGAATTCCA 600

CTGCCCAAAG GAGTCCAGCA ATTAAATGGA TTTCTAGGAA AAGCTACCTT AAGAAAGGCT 660 

GGTTACCATC GGAGTTTACA AAGTGCTTTC ACGTTCTTAC TTGTTGTATT ATACATTCAT 720

GCATTTCTAG GCTAGAGAAC CTTCTAGATT TGATGCTTAC AACTATTCTG TTGTGACTAT 780

GAGAACATTT CTGTCTCTAG AAGTTATCTG TCTGTATTGA TCTTTATGCT ATATTACTAT 840 

CTGTGGTTAC AGTGGAGACA TTGACATTAT TACTGGAGTC AAGCCCTTAT AAGTCAAAAG 900

CATCTATGTG TCGTAAAGCA TTCCTCAAAC ATTTTTTCAT GCAAATACAC ACTTCTITCC 960 

CCAAATATCA TGTAGCACAT CAATATGTAG GGAAACATTC TTATGCATCA TTTGGTTTGT 1020

TTTATAACCA ATTCATTAAA TGTAATTCAT AAAATGTACT ATGAAAAAAA TTATACGCTA 1080

TGGGATACTG GCAACAGTGC ACATATTTCA TAACCAAATT AGCAGCACCG GTCTTAATTT 1140 

GATGTTTTTC AACTTTTATT CATTGAGATG TTTTGAAGCA ATTAGGATAT GTGTGTITAC 1200

TGTACTTTTT GTTTTGATCC GTTTGTATAA ATGATAGCAA TATCTTGGAC ACATTTGAAA 1260 

TACAAAATGT TITTGTCTAC CAAAGAAAAA TGTTGAAAAA TAAGCAAATG TATACCTAGC 1320 

AATCACTTTT ACTTTTTGTA ATTCTGTCTC TTAGAAAAAT ACATAATCTA ATCAATTTCT 1380

TTGTTCATGC CTATATACTG TAAAATTTAG GTATACTCAA GACTAGTTTA AAGAATCAAA 1440 
GTCATTTTTT TCTCTAATAA ACTACCACAA CCTTTCTTTT TTAAAAAAAA AAA 



Seq ID NO: 492 DNA sequence 

Nucleic Acid Accession #: NM_000577.1

Coding sequence: 41..520

1 11 21 31 41 51 

| | | | | |

GGCACGAGGG GAAGACCTCC TGTCCTATCA GGCCCTCCCC ATGGCTTTAG AGACGATCTG 
CCGACCCTCT GGGAGAAAAT CCAGCAAGAT GCAAGCCTTC AGAATCTGGG ATGTTAACCA 
GAAGACCTTC TATCTGAGGA ACAACCAACT AGTTGCCGGA TACTTGCAAG GACCAAATGT 
CAATTTAGAA GAAAAGATAG ATGTGGTACC CATTGAGCCT CATGCTCTGT TCTTGGGAAT 
CCATGGAGGG AAGATGTGCC TGTCCTGTGT CAAGTCTGGT GATGAGACCA GACTCCAGCT 
GGAGGCAGTT AACATCACTG ACCTGAGCGA GAACAGAAAG CAGGACAAGC GCTTCGCCTT 
CATCCGCTCA GACAGTGGCC CCACCACCAG TTTTGAGTCT GCCGCCTGCC CCGGTTGGTT 
CCTCTGCACA GCGATGGAAG CTGACCAGCC CGTCAGCCTC ACCAATATGC CTGACGAAGG 
CGTCATGGTC ACCAAATTCT ACTTCCAGGA GGACGAGTAG TACTGCCCAG GCCTGCCTGT 
TCCCATTCTT GCATGGCAAG GACTGCAGGG ACTGCCAGTC CCCCTGCCCC AGGGCTCCCG 
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GCTATGGGGG CACTGAGGAC CAGCCATTGA GGGGTGGACC CTCAGAAGGC GTCACAACAA 660 

CCTGGTCACA GGACTCTGCC TCCTCTTCAA CTGACCAGCC TCCATGCTGC CTCCAGAATG 720 

GTCTTTCTAA TGTGTGAATC AGAGCACAGC AGCCCCTGCA CAAAGCCCTT CCATGTCGCC 780

TCTGCATTCA GGATCAAACC CCGACCACCT GCCCAACCTG CTCTCCTCTT GCCACTGCCT 840 

CTTCCTCCCT CATTCCACCT TCCCATGCCC TGGATCCATC AGGCCACTTG ATGACCCCCA 900 

ACCAAGTGGC TCCCACACCC TGTTTTACAA AAAAGAAAAG ACCAGTCCAT GAGGGAGGTT 960

TTTAAGGGTT TGTGGAAAAT GAAAATTAGG ATTTCATGAT TTTTTTTTTT CAGTCCCCGT 1020

GAAGGAGAGC CCTTCATTTG GAGATTATGT TCTTTCGGGG AGAGGCTGAG GACTTAAAAT 1080

ATTCCTGCAT TTGTGAAATG ATGGTGAAAG TAAGTGGTAG CTTTTCCCTT CTTTTTCTTC 1140 

TTTTTTTGTG ATGTCCCAAC TTGTAAAAAT TAAAAGTTAT GGTACTATGT TAGCCCCATA 1200

ATTTTTTTTT TCCTTTTAAA ACACTTCCAT AATCTGGACT CCTCTGTCCA GGCACTGCTG 1260

CCCAGCCTCC AAGCTCCATC TCCACTCCAG ATTTTTTACA GCTGCCTGCA GTACTTTACC 1320
TCCTATCAGA AGTTTCTCAG CTCCCAAGGC TCTGAGCAAA TGTGGCTCCT G 
T AAATTGCTCC TTGACATTGT AGAGCTTCTG G 

3 TGCCTCTGCC TGTCTCCCCC ACCAGGCTGG GAGCTCTGCA 1500 

GAGCAGGAAA CATGACTCGT ATATGTCTCA GGTCCCTGCA GGGCCAAGCA CCTAGCCTCG 1560

CTCTTGGCAG GTACTCAGCG AATGAATGCT GTATAIGTTG GGTGCAAAGT TCCCTACTTC 1620

CTGTGACTTC AGCTCTGTTT TACAATAAAA TCTTGAAAAT GCCTAAAAAA AAAAAAAAAA 1680 



MALETICRPS GRKSSKMQAF RIWDVNQKTF YLRNNQLVAG YLQGPNVNLE E 
HALFLGIHGG KMCLSCVKSG DETRLQLEAV NITDLSENRK QDKRFAFIRS DSGPTTSFES 
AACPGWFLCT AMEADQPVSL TNMPDEGVMV TKFYFQEDE 



A GCGAGCGTTC GGACCTCGCA CCCCGCGCGC CCCGCGCCGC C 



GGCTTTTGTT GTCTCCGCCT CCTCGGCCGC CGCCGCCTCT GGACCGCGAG CCGCGCGCGC 120

CGGGACCTTG GCTCTGCCCT TCGCGGGCGG GAACTGCGCA GGACCCGGCC AGGATCCGAG 180 

AGAGGCGCGG GCGGGTGGCC GGGGGCGCCG CCGGCCCCGC CATGGAGCTC CGGGCCCGAG 240 

GCTGGTGGCT GCTATGTGCG GCCGCAGCGC TGGTCGCCTG CGCCCGCGGG GACCCGGCCA 300

GCAAGAGCCG GAGCTGCGGC GAGGTCCGCC AGATCTACGG AGCCAAGGGC TTCAGCCTGA 360

GCGACGTGCC CCAGGCGGAG ATCTCGGGTG AGCACCTGCG GATCTGTCCC CAGGGCTACA 420 

CCTGCTGCAC CAGCGAGATG GAGGAGAACC TGGCCAACCG CAGCCATGCC GAGCTGGAGA 480 

CCGCGCTCCG GGACAGCAGC CGCGTCCTGC AGGCCATGCT TGCCACCCAG CTGCGCAGCT 540 

TCGATGACCA CTTCCAGCAC CTGCTGAACG ACTCGGAGCG GACGCTGCAG GCCACCTTCC 600

CCGGCGCCTT CGGAGAGCTG TACACGCAGA ACGCGAGGGC CTTCCGGGAC CTGTACTCAG 660

AGCTGCGCCT GTACTACCGC GGTGCCAACC TGCACCTGGA GGAGACGCTG GCCGAGTTCT 720

GGGCCCGCCT GCTCGAGCGC CTCTTCAAGC AGCTGCACCC CCAGCTGCTG CTGCCTGATG 780 

ACTACCTGGA CTGCCTGGGC AAGCAGGCCG AGGCGCTGCG GCCCTTCGGG GAGGCCCCGA 840 

GAGAGCTGCG CCTGCGGGCC ACCCGTGCCT T TGC T 3CTCCTTT GTGCAGGGCC 900

TGGGCGTGGC CAGCGACGTG GTCCGGAAAG TGGCTCAGGT CCCCCTGGGC CCGGAGTGCT 960

CGAGAGCTGT CATGAAGCTG GTCTACTGTG CTCACTGCCT GGGAGTCCCC GGCGCCAGGC 1020 

CCTGCCCTGA CTATTGCCGA AATGTGCTCA AGGGCTGCCT TGCCAACCAG GCCGACCTGG 1080 

ACGCCGAGTG GAGGAACCTC CTGGACTCCA TGGTGCTCAT CACCGACAAG TTCTGGGGTA 1140 

CATCGGGTGT GGAGAGTGTC ATCGGCAGCG TGCACACGTG GCTGGCGGAG GCCATCAACG 1200 

CCCTCCAGGA CAACAGGGAC ACGCTCACGG CCAAGGTCAT CCAGGGCTGC GGGAACCCCA 1260

AGGTCAACCC CCAGGGCCCT GGGCCTGAGG AGAAGCGGCG CCGGGGCAAG CTGGCCCCGC 1320 

GGGAGAGGCC ACCTTCAGGC ACGCTGGAGA AGCTGGTCTC TGAAGCCAAG GCCCAGCTCC 1380 

GCGACGTCCA GGACTTCTGG ATCAGCCTCC CAGGGACACT GTGCAGTGAG AAGATGGCCC 1440

TGAGCACTGC CAGTGATGAC CGCTGCTGGA ACGGGATGGC CAGAGGCCGG TACCTCCCCG 1500 

AGGTCATGGG TGACGGCCTG GCCAACCAGA TCAACAACCC CGAGGTGGAG GTGGACATCA 1560

CCAAGCCGGA CATGACCATC CGGCAGCAGA TCATGCAGCT GAAGATCATG ACCAACCGGC 1620 

TGCGCAGCGC CTACAACGGC AACGACGTGG ACTTCCAGGA CGCCAGTGAC GACGGCAGCG 1680 

GCTCGGGCAG CGGTGATGGC TGTCTGGATG ACCTCIGCGG CCGGAAGGTC AGCAGGAAGA 1740 

GCTCCAGCTC CCGGACGCCC TTGACCCATG CCCTCCCAGG CCTGTCAGAG CAGGAAGGAC 1800 

AGAAGACCTC GGCTGCCAGC TGCCCCCAGC CCCCGACCTT CCTCCTGCCC CTCCTCCTCT 1860

TCCTGGCCCT TACAGTAGCC AGGCCCCGGT GGCGGTAACT GCCCCAAGCC CCCAGGGACA 1920

GAGGCCAAGG ACTGACTTTG CCAAAAATAC AACACAGACG ATATTTAATT CACCTCAGCC 1980 

TGGAGAGGCC TGGGGTGGGA CAGGGAGGGC CGGCGGCTCT GAGCAGGGGC AGGCGCAGAG 2040

GTCCCAGCCC CAGGCCTGGC CTCGCCTGCC TTTCTGCCTT TTAATTTTGT ATGAGGTCCT 2100 

CAGGTCAGCT GGGAGCCAGT GTGCCCAAAA GCCATGTATT TCAGGGACCT CAGGGGCACC 2160

CTACAGAGGA GGCCTCAAAG CAACCCGCTG GAGCCCACAG CGAGCCTGTG CCTTCCTCCC 2280

CGCCTCCTCC CACTGGGACT CCCAGCAGAG CCCACCAGCC AGCCCTGGCC CACCCCCCAG 2340

CCTCCAGAGA AGCCCCGCAC GGGCTGTCTG GGTGTCCGCC ATCCAGGGTC TGGCAGAGCC 2400

TCTGAGATGA TGCATGATGC CCTCCCCTCA GCGCAGGCTG CAGAGCCCCG CCCCACCTCC 2460

CTGCGCCC1T GAGGGGCCCC AGCGTCTGCA GGGTGACGCC TGAGACAGCA CCACTGCTGA 2520

GGAGTCTGAG GACTGTCCTC CCACAGACCC TGCAGTGAG^ GG C( Tl ZAJ GCGCAGATGA 2580 

GGGGCCACTG ACCCACCTGC GCTTCTGCTG GAGGAGGGGA AGCTGGGCCC AAAGGCCCAG 2640

GGAGGCAGCG TGGGCTCTGC CAATGTGGGC TGCCCCTCGC ACACAGGGCT CACAGGGCAG 2700

GCCTTGCTGG GGTCCAGGGC TGTTGGAGGA CCCCGAGGGC TGAGGAGCAG CCAGGACCCG 2760

CCTGCTCCCA TCCTCACCCA GATCAGGAAC CAGGGCCTCC CTGTTCACGG TGACACAGGT 2820

CAGGGCTCAG AGTGACCCTC GGCTGTCACC TGCTCACAGG GATGCTGGTG GCTGGTGAGA 2880

CCCCGCACTG CACACGGGAA TGCCTAGGTC CCTTCCCGAC CCAGCCAGCT GCACTGCAGG 2940

f AAGGGCTTTT CCAAACATGC ATCCATTTAC TGACACTTCC 3 000 
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TCCTGAACCG ACTGACCCTG AGGAGGCCGC TTAGTGCTGC TTTGCTTTTC ATCACCGTCC 3300 

CGCACAGTGG ACGGAGGTCC CCGGTTGCTG GTCAGGTCCC CATGGCTTGT TCTCTGGAAC 3360 

CTGACTTTAG ATGTTTTGGG ATCAGGAGCC CCCAACACAG GCAAGTCCAC CCCATAATAA 3420

CCCTGCCAGT GCCAGGGTGG GCTGGGGACT CTGGCACAGT 1ATGC1 I ( CA3GACAG 3480 

CAGCACTCCC GCTGCACACA GACGGCCTAG GGGTGGCGCT CAGACCCCAC CCTACGCTCA 3540

TCTCTGGAAG GGGCAGCCCT GAGTGGTCAC TGGTCAGGGC AGIGGCCAAG CCTGCTGTGT 3600 

CCTTCCTCCA CAAGGTCCCC CCACCGCTCA GTGTCAGCGG GTGACGTGTG TTCTTTTGAG 3660 
TCCTTGTATG AATAAAAGGC TGGAAACCTA AA 



| | | | | |

MELRARGWWL LCAAAALVAC ARGDPASKSR SCGEVRQIYG AKGFSLSDVP QAEISGEHLR
ICPQGYTCCT SEMEENLANR SHAELETALR DSSRVLQAML ATQLRSFDDH FQHLLNDSER
TLQATFPGAF GELYTQNARA FRDLYSELRL YYRGANLHLE ETLAEFWARL LERLFKQLHP
QLLLPDDYLD CLGKQAEALR PFGEAPRELR LRATRAFVAA RSFVQGLGVA SDVVRKVAQV
PLGPECSRAV MKLVYCAHCL GVPGARPCPD YCRNVLKGCL ANQADLDAEW RNLLDSMVLI
TDKFWGTSGV ESVIGSVHTW LAEAINALQD NRDTLTAKVI QGCGNPKVNP QGPGPEEKRR
RGKLAPRERP PSGTLEKLVS EAKAQLRDVQ DFWISLPGTL CSEKMALSTA SDDRCWNGMA
RGRYLPEVMG DGLANQINNP EVEVDITKPD MTIRQQIMQL KIMTNRLRSA YNGNDVDFQD



| | | | | |

GGGGCAGGCA ATGAGAGCTG CACTCTGGCT GGGGAAGGCA TGAGTGACAG ACCCACAGCA 60 

AGGCGGTGGG GTAAGTGTGG ACCTTTGTGT ACCAGAGAGA ACATCATGGT GGCTTTCAAA 120

GGGGTCTGGA CTCAAGCTTT CTGGAAAGCA GTCACAGCGG AATTTCTGGC CATGCTTATT 180

TTTGTTCTCC TCAGCCTGGG ATCCACCATC AACTGGGGTG GAACAGAAAA GCCTTTACCG 240 

GTCGACATGG TTCTCATCTC CCTTTGCTTT GGACTCAGCA TTGCAACCAT GGTGCAGTGC 300 

TTTGGCCATA TCAGCGGTGG CCACATCAAC CCTGCAGTGA CTGTGGCCAT GGTGTGCACC 360 

AGGAAGATCA GCATCGCCAA GTCTGTCTTC TACATCGCAG CCCAGTCCCT GGGGGCCATC 420 

ATTGGAGCAG GAATCCTCTA TCTGGTCACA CCTCCCAGTG TGGTGGGAGG CCTGGGAGTC 480 

ACCATGGTTC ATGGAAATCT TACCGCTGGT CATGGTCTCC TGGTTGAGTT GATAATCACA 540

TTTCAATTGG TGTTTACTAT CTTTGCCAGC TGTGATTCCA AACGGACTGA TGTCACTGGC 600

TCAATAGCTT TAGCAATTGG ATTTTCTGTT GCAATTGGAC ATTTATTTGC AATCAATTAT 660

ACTGGTGCCA GCATGAATCC CGCCCGATCC TTTGGACCTG CAGTTATCAT GGGAAATTGG 720

GAAAACCATT GGATATATTG GGTTGGGCCC ATCATAGGAG CTGTCCTCGC TGGTCGCCTT 780

TATGAGTATG TCTTCTGTCC AGATGTTGAA TTCAAACGTC GTTTTAAAGA AGCCTTCAGC 840

AAAGCTGCCC AGCAAACAAA AGGAAGCTAC ATGGAGGTGG AGGACAACAG GAGTCAGGTA 900

GAGACGGATG ACCTGATTCT AAAACCTGGA GTGGTGCATG TGATTGACGT TGACCGGGGA 960

GAGGAGAAGA AGGGGAAAGA CCAATCTGGA GAGGTATTGT CTTCAGTATG ACTAGAAGAT 1020

CGCACTGAAA GCAGACAAGA CTCCTTAGAA CTGTCCTCAG ATTTCCTTCC ACCCATTAAG 1080 

GAAACAGATT TGTTATAAAT TAGAAATGTG CAGGTTTGTT GTTTCATGTC ATATTACTCA 1140 

GTCTAAACAA TAAATATTTC ATAATTTACA AAGGAGGAAC GGAAGAAACC TATTGTGAAT 1200 

TCCAAATCTA AAAAAAGAAA TATTTTTAAG ATGTTCTTAA GCAAATATAT ACCTATTTIA 1260 

TCTAGTTACC TTTCATTAAC AACCAATTTT AACCGTGTGT CAAGATTTGG TTAAGTCTTG 132 0 

CCTGACAGAA CTCAAAGACA CGTCTATCAG CTTATTCCTT CTCTACTGGA ATATTGGTAT 1380 
AGTCAATTCT TATTTGAATA TTTATTCTAT TAAACTGAGT TTAACAATGG C 

Seq ID NO: 497 Protein sequence
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MSDRPTARRW GKCGPLCTRE NIMVAFKGVW TQAFWKAVIA EFLAMLIFVL LSI 

GTEKPLPVDM VLISLCFGLS IATMVQCFGH ISGGHINPAV TVAMVCTRKI SIAKSVFYIA 

AQCLGAIIGA GILYLVTPPS VVGGLGVTMV HGNLTAGHGL LVELIITFQL VFTIFASCDS

KRTDVTGSIA LAIGFSVAIG HLFAINYTGA SMNPARSFGP AVIMGNWENH WIYWVGPIIG 

AVLAGGLYEY VFCPDVEFKR RFKEAFSKAA QQTKGSYMEV EDNRSQVETD DLILKPGVVH

i KGKDQSGEVL SSV 



Seq ID NO: 498 DNA sequence 



1 11 21 31 41 51 

| | | | | |

CCCCCTTGTC ATTAATACAT TAAAAAGATT CAATCTTTAC CCTGAGGTAA TTTTCGCCAG
TTGGTACCGG ATTTATACCA AAATAATGGA CTTGATTGGT ATTCAAACCA AGATATGTTG
GACGGTTACC AGAGGAGAAG GACTCAGTCC TATTGAAAGC TGTGAACGAT TGGGAGATCC
TGCTTGCTTT TATGTTGCTG TAATTTTTAT TTTAAATGGA CTAATGATGG CATTATTCTT 
CATATATGGC ACATATTTAA GTGGCAGCCG ATTAGGAGGC CTGGTTACAG TGTTGTGCTT 
CTTTTTCAAT CATGGAGAGT GTACCCGTGT AATGTGGACA CCACCTCTCC GTGAAAGCTT 
CTCATATCCA TTTCTTGTTC TTCAGATGTT GCTAGTGACT CATATTCTCA GGGCTACAAA 
ACTTTATAGA GGAAGCTTGA TTGCACTCTG CATTTCCAAT GTATTTTTCA TGCTTCCTTG 
GCAGTTTGCT CAGTTTGTAC TTCTTACTCA GATTGCATCA TTATTTGCAG TATATGTTGT 
CGGGTACATT GATATATGTA AATTACGGAA GATCATTIAT ATACACATGA TTTCTCTTGC 
AC1TTGTTTT GTTTTGATGT TTGGGAACTC AATGTTATTA ACTTCTTATT ATGCTTCTTC 
TTTGGTAATT ATTTGGGGTA TTCTGGCAAT GAAACCACAT TTCCTGAAAA TAAATGTATC 
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TGAACTTAGT TTATGGGTTA TTCAAGGATG TTTTTGGTTA TTTGGAACTG TCATACTTAA 7 80 

A TCTAAAATTT TTGGTATTGC AGATGACGCT CATATTGGCA ACTTACTAAC 84 0 

C TTTAGTTATA AGGATTTTGA TACTTTATTG TATACCTGTG CAGCGGAGTT 900 
3 GAAAAAGAGA CTCCACTGAG ATACACAAAG A 
TCTTGTAGTG TTTGTTGCTA TTGTTAGAAA GATTATTAGT GATATGTGGG G 

ATTGCAATTG TTAGCATATA CAGCCCTTGG TATTTTAATT ATGAGACTAA AACTCTTCTT 
GACACCACAC ATGTGTGTTA TGGCATCACT GATCTGCTCA AGACAC 1 1 ATCGCI 
CTTTTGCAAA GTACATCCTG GTGCTATTGT GTTTGCTATA TTAGCAGCAA TGTCAATACA 
AGGTTCAGCA AATCTGCAAA CCCAGTGGAA TATTGTAGGG GAGTTCAGCA ATTTGCCCCA 
AGAAGAACTT ATAGAATGGA TCAAATATAG TACTAAJ 
CATGCCCACG ATGGCAAGTG TTAAGCTCTC TGCACTICGG CCC 

TTATGAAGAC GCAGGCTTGA GAGCCAGAAC AAAAATAGTA TACTCAATGT ATAGTCGGAA 1500

AGCAGCCGAA GAAGTGAAGC GAGAACTGAT AAAGTTAAAA GTGAACTATT ACATTCTAGA 1560 

AGAGTCATGG TGTGTAAGAA GATCCAAGCC TGGTTGCAGT ATGCCTGAAA TTTGGGATGT 1620

AGAAGATCCT GCCAATGCTG GGAAAACTCC CTTATGTAAC CTCTTGGTGA AGGATTCCAA 1680 

ACCTCACTTC ACCACTGTAT TCCAGAACAG TGTTTACAAA GTCCTAGAAG TTGTAAAAGA 1740 

ATGACTGCTA CATGACCTGC TGCCTACGGA GAACTACATC TGTAATGGTT TTAATGTTTT 1800 

GCTAAGTCAT GTGTTGTTCA TATCCCAAAA ACTTTTATAG GTAACTGTTT TCAAATAGAA 1860 

AACGTTTTAT TTGGTCAATT TGAATGTCAT TCTAATTATA AAAATGACTT ACACCTTTAT 1920

CAATTGGTTA CTATTTCAAT GCACCCTTTA AAATTTGCTA TGCAAATGAG TATATGCTTG 1980 

TACTTGACTT TAATATTTGT GCTAAAGTGA GCAAAGCTAC CTGTATAAAG AAAACACAGT 2040

GGGTTGTGAC AAGGATGACA TGAAAATACA GGACAATTCT GACAATGTAG GGGCTGATTT 2100

TATAGTGTAA GAACTATTAA TGCCCCTTGC TTCTTTTTTC TGCCTCTTGC TCTTGTCTTT 2160 

TGGACATTTC AGIGATTGTA AGTTCTTCGG TCATGTCAGC CCCTGTCATC AACTTGAGTT 2220 

ACAGTAGATG GGGCAGACAT GGAGTGTTTG CTATATAAAA CTATCTGTTT GTTTTACTTC 2280

CTTGTGCGCT TTTTGTTCTC TGTTCTCTTG TTAATGAAGC TTTTCCTGCC CATTATTAAT 2340 

CCAAACTCTT GGACCTTGTG GTTAGGAAAT TCCCTTAACT TCCAGCCATA TGGCATTATC 2400 

GTGTCTCTTT CTCTCTCTCT CTTGCTCTCT CTCTTCTCCT CTTCCCCATA TTTTCTGTCA 2460 

AATAAGTACT GTTTACTCAT TTAGTTGCTT ATCAAGTACT TAT7CTTGGT TTTAAAAAAA 2520

ATTAATGGTA ACTGTATTTT TCTCATTTTT AGCATTATTC AAATGTTTAT ATTTTAATAC 2580 

CTTTAAACCA CTTTAAAGTT TTTTCATGTT TAATTATAGT TTTAAGAAAA ACTATTTTGA 2640 

ACAACCCCAA ATATAGTGCA TCTAGAAACT AATGTATATT TGATTAGACA TCATTTATAG 2700

TGGAACAGTA GACTGTAGTA CATGGTAATT TTTCTTTTAC TATTAAGATA CAATAAAACA 2760 

TGACTAATTT TGCTGTCAAA AATGTAAAGA ATAATGATAA ATGGAGTTTT TTATATTTTA 2820 

CTTTTAAGAT TGCCTGTCTT TAATAAGACA AAGCCTTAAG CCTTATGTTA TAATTTTGGT 2880

TCTAAAAACC ATCATTTCAG TATAAGGAAT AAGTATATTT CGTCCTCCTC TTTAGTTTTT 2940

TTCTTCCTAT TTATTTTTAT TTTGAAAAAT TTCTACACCT TCTTTGAATT CCTTGTATGA 3000 

ATTTTTGTTT CTTAGAAGTT AATTTGTGTG AAATGAGATT CTTCAAAACG ATGAAACCTC 3060 

ATAGCTCTGA GAAAAGGTTT TAGGGTTTTA AATTCTAAGC AAAGCGTGAC TATGGCTGAC 3120

AGACTACACA TTTAATTATA CAGCTTCTCT TTCTTAACCA CAGGCAGATT AACCTCATTG 3180 

TGGATTGTCC TTCAGACCTT AGTCCTCAGG CATGGTTTCT GGTGCCCACT CCTGGAAGCC 3240

GCTGTTCCCT TTCTACCTTC TTACCAGAGC CCAAGGGCAG GCCTGGTCCC GGGGAAGCAG 3300 

CAGCTTGCTG ACATAAGTCA GCTGCAAAGG CTGAGGAGTG TGCCCTCAGA GAAGCACCGC 3360

CCCCCAGTCT TGTGCCAGCG CCTAGAGCCG CAGCTCCCAG GGATGCTCCT TCCCTGGAGG 3420

CAGCCCAGGA GAGGGACTCT GGCAGCGTTC TTCAGATTTG TGGCCACTGT TTCTCATTTG 3480 

CTGGTTGACT GTTTTTATTT CTTAGGCTTT TGCTAGTTTT AGAAAATAGG GAAGCAGCCC 3540

TTGATTTGTG GATTAAAAGC AACATTTGAG CGATGATGCA CAACAGTCCA GGAAAATGGG 3600

CGGTGGACAC TTGAGGCTGA GGATGGGAGT TGACATGAGC AGGGAGAGGG AGGTGCGCGC 3660 

TGCTTATCTG TGATTGTTGC TCACCTGAGT GTGGCTGATT GTGTACATCC AGCAGTTACA 3720

ATTTTTAAAA ATTATACTTT TACATTTATT TTATATTTTT CTCACCCCCA GTAATTTCCT 3780

TCCAAAGAAG TTCACATGTA ATAAGTAGAA ATTCTGTATA GGAAAAAAGC ATTAAAAATA 3840 

CTATTATAAC TGCTTCATTT GCTGGGAACC ATTAAAAGTA ATATAAATTA GCTTTTTCCA 3900 

GAAGGATCCT TTTGTAGCAG TGTTTATGAA TGTAACCCCC ACCAAAATAT GGCIATATAT 3960 

TAGGGGAGCC AGTTTGGAGC AGAGGCCTGA AGGTCCCTGC TATGCAGCCG TGGCCACAGC 4020 

TCGCAGCCCA AGCACTGTGG AGCATCCACA CCTTTGATGG CAATGCAGAT TGGTAGCAGG 4080

TTCCATAGGC GTACAAAACA GTATTAAAGC TCAGTGTTTT GCATAITGTT AGCATTTACA 4140

AATATTTTTG CTTTAGTATG AGGAAAGTAA GGATGGGCAA AGAAGCGATC AAAATAGCTA 4200

TTGCTACAAC ATTTTCGAAA ACAAAGTTGG GGCTGTATTT CTTTAAAAAG ATAAGCCTCT 4260

AAAAATGCTT GGCAAAAAAA ATATAGTGTT AAAATAGGCC AGTGATATTA ATGAGAAAAT 4320

GAAAGTATGT ATCAGGAATA AAGTGATATI GCATAGGAGT ATTGTATTTT TATGAATTTT 4380
ATGCCAGTTG TTTACATGTA CTATATATGT TAAATTAAAA AAAATCATGA GAAATG 



I I I I I I 

PLVINTLKRP NLYPEVILAS WYRIYTKIMD LIGIQTKICW TVTRG3GLSP IESCEGLGDP 

ACFYVAVIFI LNGLMMALFF IYGTYLSGSR LGGLVTVLCF FFNHGECTRV MWTPPLRESF 

SYPFLVLQML LVTHILRATK LYRGSLIALC ISNVFFMLPW QFAQFVLLTQ IASLFAVYW 

GYIDICKLRK IIYIHMISLA LCFVLMFGNS MLLTSYYASS LVIIKGILAM KPHFLKINVS 

ELSLWVIQGC FWLFGTVILK YLTSKIFGIA DDAHIGNLLT SKFFSYKDFD TLLYTCAAEF 

DFMEKETPLR YTKTLLLPW LWFVAIVRK IISDMWGVLA KQQTHVRKHQ FDHGELVYHA 

LQLLAYTALG ILIMRLKLFL TPHMCVMASL ICSRQLFGKL FCKVHPGAIV FAILAAMSIQ 

GSANLQTQWN IVGEFSNLPQ EELIEWIKYS TKPDAVFAGA MPTMASVKLS ALRPIVNHPH 

YEDAGLRART KIVYSMYSRK AAEEVKRELI KLKVNYYILE ESWCVRRSKP GCSMPEIWDV 

EDPANAGKTP LCNLLVKDSK PHFTTVFQNS V 

Seq ID NO: 500 DNA sequence 
Coding sequence: 127.. 1278 
GCCAGAATGG GTGTGAAGGC GTCTCAAACA GGCTTTGTGG TCCTGGTGCT C-CTCCAGTGC

TGCTCTGCAT ACAAACTGGT CTGCTACTAC ACCAGCTGGT CCCAGTACCG GGAAGC-CGAT 

GGGAGCTGCT TCCCAGATGC CCTTGACCGC TTCCTCTGTA CCCACATCAT CTACAGCTTT 

GCCAATATAA GCAACGAT CA CATCGACACC TGGGAGTGGA ATGATGTGAC GCTCTACGGC 

ATGCTCAACA CACTCAAGAA CAGGAACCCC AACCTGAAGA CTCTCTTGTC TGTCGGAGGA 

TGGAACTTTG GGTCTCAAAG ATTTTCCAAG ATAGCCTCCA ACACCCAGAG TCGCCC-GACT 

TTCATCAAGT CAGTACCGCC ATTCCTGCGC ACCCATGGCT TTGATGGGCT C-GACCTTGCC 

TGGCTCTACC CTGGACGGAG AGACAAACAG CATTTTACCA CCCTA 



CCCACCTTCG 
ATCTGTGACT 



AGGTCACCAT 
GCATCATGAC 
TGTTCCGAGG 
GGTACATGTT 
GGAGGAGCTT 



GCCTCAGTCT 
GCCCTOGTGG 
GACTCGGGAT 



CAACGTAGCC 
CTCCAGCTGG 
CCCTCCCTTG 
GCAGAGAGGT 
TAGTACACAC 



GAGGCTGGGG GCTCCTGCCA 
CACTCTGGCT TCTTCTGAGA 
CCGGTTCACC AAGGAGGCAG 
AGCCACAGTC CATAGAACCC 
GGTAGGATAC GACGACCAGG 
GCTGGCAGGC GCCATGGTAT 



CTCTGTTCTG 



CTGGTGTTGG AGCCCCAATC 960 




TCGGCCAGCA 
AAAGCGTCAA 
GGGCCCTGGA 
TCACCAATGC 



AAGCAAGGTG 
CCTGGATGAC 
CATCAAGGAT 



GGGCCTATGC 
AGGGATGGGG 
TTGTTGATGA 



TGGCAAGCTC T 



ACTTCCCCTT 
CGCTTTGCTT 
TCTTCTGGGT 



GAGCCAAACA 
TGAAACCTTC 
CAGCTGCTCA 



T GAGCCTTGGG 



CTGTGGGGAT 
TTAATGGAAA 
CCTAGCCCTC 
TCCTACAAGA 
ACTTAGGAAC 
ATAAAGTACA 
ACTAGACCCA 
ACCCCTGAGC 



TGTTTACAGA 



CACAGTGACC 
GTAATCGTGT 
AGAGTTTAAC 
CTGGACTCAC 



TTTGAGCTCA 
CGCAATGTAA ' 
TCCCCAAGCC 
GACACCATTT 
ATACTAATTA 
CCCCTATCCT 
AGTGTGTTGG 
CTCCCCCATC 
GAAGGCCGCC 



I 



ii 



I 



MGVKASQTGP WLVLLOCCS AYKLVCYYTS WSQYREGDGS CFPDALDRFL CTHIIYSFAN 
ISNDHIDTWE WNDVTLYGML NTLKNRNPNL KTLLSVGGWN FGSQRFSKIA SNTQSRRTFI 
KSVPPFLRTH GFDGLDLAWL YPGRRDKQHF TTLIKEMKAE FIKEAOPGKK QLLLSAALSA 
GKVTIDSSYD IAKISQHLDF ISIMTYDFHG AWRGTTGHHS PLFRGQEDAS PDRFSNTDYA 
VGYMLRLGAP ASKLVMGIPT FGRSFTLASS ETGVGAPISG PGIPGRFTKE AGTLAYYEIC 
DFLRGATVHR TLGQQVPYAT KGNQWVGYDD QESVKSKVQY LKDRQLAGAM VWALDLDDFO 
GSFCGQDLRF P 




GCCTCAAACG 
GTTGAGAAAG 
GCCATCGGTT 
TCGCCCTAAA 
TTCTGACTCT 
CGGGCCCATT 



CAGGTGCCGA 
TGACAACTCT 
CAACTTCAGA 
TGGCCACCAG 
ATGGTTTGTC 
TCATTGGTGG 
GAGCTGAAGG 



GTGACTCCAG 
AGTGTCAACA 
CACGCGCAAG 



GCTCTTCGTT 
GCCAGAAGAT 
AGATGATGTG 
GGTGGCAACA 
AAGCACAGTC 
TCACTCCACG 
AACAGTGACC 
AATCATCGTT 
GTTACGCCCT 
CCCTGAGCTC 
C GGTGACTTTC CGTTTGCCAA 
A ACTTT 



GAACCAGCGA 
GTGTAACAGG 
AACAAAGTCC 
ATGGAGACAC 
TCATAGTTGG 
GAAAAATGTC 
GTGCTTTAAA 
GATGACCCTG 
ATTAACCGAG 



MWKVSALLFV LGSASLWVLA EGASTGQPED DTETTGLEGG VAMPGAEDDV VTPGTSEDRY 
KSGLTTLVAT SVNSVTGIRI EDLPTSESTV HAQEQSPSAT ASNVATSHST EKVDGDTQTT 
VEKDGLSTVT LVGIIVGVLL AIGFIGGIIV V 

Seq ID NO: 504 DNA sequence
Nucleic Acid Accession #: Eo 
Coding sequence: 52.. 895 
TGTCTGTGCT GCTGGATGGA TGGCTAAGGG CAGAGTTGGA TACCCCATTG TGAAGCCAGG 360

GCCCAACTG1 GGATTTGGAA AAACTGGCAT TATTGATTAT GGAATCCGTC TCAATAGGAG 420

TGAAAGATGG GATGCCTATT GCTACAACCC ACACGCAAAG GAGTGTGGTG GCGTCTTTAC 480 
AGATCCAAAG CAAATTTTTA AATCTCCAGG CTTCCCAAAT GAGTACGAAG A 
CTGCTACTGG CACATTAGAC TCAAGTATGG TCAGCGTATT CACCTGJ 



TACTACTTCT ACTGGAAATA AAAACTTTTT AGCTGGAAGA T T TAGCCACT T 

AAAAAAAGGA TGATCAAAAC ACACAGTGTT TATGTTGGAA TCTTTTGGAA CTCCTTTGAT 960 

CTCACTGTTA TTATTAACAT TTATTTATTA TTTTTCTAAA TGTGAAAGCA ATACATAATT 1020

TAGGGAAAAT TGGAAAATAT AGGAAACTTT AAACGAGAAA ATGAAACCTC TCATAATCCC 1080

ACTGCATAGA AATAACAAGC GTTAACATTT TCATATTTTT TTCTTTCAGT CATTTTTCTA 1140

TTTGTGGTAT ATGTATATAT GTACCTATAT GTATTTGCAT TTGAAATTTT GGAATCCTGC 1200 

TCTATGTACA GTTTTGTATT ATACTTTTTA AATCTTGAAC TTTAIAAACA TTTTCTGAAA 1260 

TCATTGATTA TTCTACAAAA ACATGATTTT AAACAGCTGT AAAATATTCT ATGATATGAA 132 0 

TGTTTTATGC ATTATTTAAG CCTGTCTCTA TTGTTGGAAT TTCAGGTCAT TTTCATAAAT 1380 
ATTGTTGCAA TAAATATCCT TGAACACACA AAAAAAAAAA AA 

Seq ID NO: 505 Protein sequence

1 11 21 31 41 51 

I I I I I I

MIILIYLFLL LWEDTQGWGP KDGIFHNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF 
EGGHLATYKQ LEAARKIGFH VCAAGWMAKG HVGYPIVKPG PNCGPGKTGI IDYGIRLNRS 
ERWDAYCYNP HAKECGGVFT DPKQIFKSPG FPNEYEDNQI CYWHIRLKYG QRIHLSFLDF 
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GDEIiPDDIIS TGNVMTLKFL SDASVTAGGF 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRPSHL 

Seq ID NO: 506 DNA sequence
Nucleic Acid Accession #: NM_007115.1 
Coding sequence: 69.. 902 



1 11 21 31 41 51 

I I I I I 

GAATTCGCAC TGCTCTGAGA ATTTGTGAGC AGCCCCTAAC AGGCTGTTAC TTCACTACAA 60 

CTGACGATAT GATCATCTTA ATTTACTTAT TTCTCTTGCT ATGGGAAGAC ACTCAAGGAT 120

GGGGATTCAA GGATGGAATT TTTCATAACT CCATATGGCT TGAACGAGCA GCCGGTGTGT 180

ACCACAGAGA AGCACGGTCT GGCAAATACA AGCTCACCTA CGCAGAAGCT AAGGCGGTGT 240

GTGAATTTGA AGGCGGCCAT CTCGCAACTT ACAAGCAGCT AGAGGCAGCC AGAAAAATTG 300 

GATTTCATGT CTGTGCTGCT GGATGGATGG CTAAGGGCAG AGTTGGATAC CCCATTGTGA 360 

AGCCAGGGCC CAACTGATGA TTTGGAAAAA CTGGCATTAT TGATTATGGA ATCCGTCTCA 420

ATAGGAGTGA AAGATGGGAT GCCTATTGCT ACAACCCACA CGCAAAGGAG TGTGGTGGCG 480

TCTTTACAGA TCCAAAGCGA ATTTTTAAAT CTCCAGGCTT CCCAAATGAG TACGAAGATA 540 

ACCAAATCTG CTACTGGCAC ATTAGACTCA AGTATGGTCA GCGTATTCAC CTGAGTTTTT 600

TAGATTTTGA CCTTGAAGAT GACCCAGGTT GCTTGGCTGA TTATGTTGAA ATATATGACA 660 

GTTACGATGA TGTCCATGGC TTTGTGGGAA GATACTGTGG AGATGAGCTT CCAGATGACA 720 

TCATCAGTAC AGGAAATGTC ATGACCTTGA AGTTTCTAAG TGATGCTTCA GTGACAGCTG 780

GAGGTTTCCA AATCAAATAT GTTGCAATGG ATCCTGTATC CAAATCCAGT CAAGGAAAAA 840 

ATACAAGTAC TACTTCTACT GGAAATAAAA ACTTTTTAGC TGGAAGATTT AGCCACTTAT 900 

AAAAAAAAAA AAGGATGATC AAAACACACA GTGTTTATGT TGCAATCTTT TGGAACTCCT 960 

TTGATCTCAC TGTTATTATT AACATTTATI TATTATTTTT CTAAATGTGA AAGAAATACA 1020

TAATTTAGGG AAAATTGGAA AATATAGGAA ACTTTAAACG AGAAAATGAA ACCTCTCATA 1080

ATCCCACTGC ATAGAAATAA CAAGCGTTAA CATTTTCATA TTTTTTTCTT TCAGTCATTT 1140 

TTGTATTTGT GGTATATGTA TATATGTACC TATATGTATT TGCATTTGAA ATTTTGGAAT 1200

CCTGCTCTAT GTACAGTTTT GTATTATACT TTTTAAATCT TGAACTTTAT GAACATTTTC 1260

TGAAATCATT GATTATTCTA CAAAAACATG ATTTTAAACA GCTGTAAAAT ATTCTATGAT 1320 

ATGAATGTTT TATGCATTAT TTAAGCCTGT CTCTATTGTT GGAATTTCAG GTCATTTTCA 1380
TAAATATTGT TGCAATAAAT ATCCTTCGGA ATTC 



MIILIYLFLL LWEDTQGWGF KDGIFHNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF 
EGGHLATYKQ LEAARKIGFH VCAAGWMAKG RVGYPIVKPG PNXXFGKTGI IDYGIRLNRS 
ERWDAYCYNP HAKECGGVFT DPKRIFKSPG FPNEYEDNQI CYWHIRLKYG QRIHLSFLDF 
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GDELPDDIIS TGNVMTLKFL SDASVTAGGF 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRFSHL 

Seq ID NO: 508 DNA sequence 
Coding sequence: 129.. 1991 



r CGGCGCCAGG 

AAAGCCCAGG CCCGGGCGGC CAGACCAAGA GGGAAGAAGC ACAGAATTCC TCAACTCCCA 

GTGTGCCCAT GAGTAAGAGC AAATGCTCCG TGGGACTCAT GTCTTCCGTG GTGGCCCCGG

CTAAGGAGCC CAATGCCGTG GGCCCGAAGG AGGTGGAGCT CATCCTTGTC AAGGAGCAGA 




TCCTGGTCCC CTACCTGCTC T 
TGGCCCTCGG CCAGTTCAAC A 
TGAAAGGTGT GGGCTTCACG G 






TCCACTGCAA CAACTCCTGG 
GTGGAGACAG CTCGGGCCTC 
AACGTGGCGT GCIGCACCTC 
GGCAGCTCAC AGCCTGCCTG 



AACAGCCCCA 



CTGCCCTGCT CCTGCGTGGG 
TGAGCGTTGA CTTCTACCGG 
TGTGCTTCTC CC1GGGCGTG 



CACCAGAGCC 
GTGCTGGTCA 
GTGGTATGGA 



CTCTGCGAGG 



A TTGCTGGGAT 
G CCGCTGGTGT 
A TCTCACTGTA 
TCTCCTCCTT 
ACTGCTCGGA 
TTGGGACCAC 
ATGGCATCGA 
TCGTGCTGCT 
TCACAGCCAC 
CTGGAGCCAT 



GCCACTTTTC 
TGTCGGCTTC 






ACCTGCTGCC 
CGACCTGGGG 
CTACTTCAGC 



TGCCCCATAC 
TTCTACAACG 
CTCCCCTGGA 
GGTGACTCCA 
GAGTACTTTG 




GATTGACGCG G 

CATCAACTCC C 
GGCACAGAAG C 



TCCAGCTGCT G 



TTGCAGCCGG 
TCTATGGTGT 
TGTACTGGCG 
TCAGCATTGT 



CACGTCCATC 



CTCTTTGGAG TGCTCATCGA AGCCATCGGA 
AGCGACGACA TCCAGCAGAT GACCGGGCAG 
AAGCTGGTCA GCCCCTGCTT TCTCCTGTTC 



ATCGATGAGT 
GCGACCTTCC 
CTGGACCATT 



CGGCCCAGCC 



GCTGTGCTGG A 
GACCTTCAGA C 

CTGGGTCATC GCCACATCCT CCATGGCCAT GGTGCCCATC TATGCGGCCT 
CAGCCTGCCT G 



AGGTGCGCCA G' 



TCTGIGTGAG 
CACTGTGTTC 
CTCACAGTAG 
GGTTTAGCTG 
GATGCGTGGC 
GTCTGTTCAG 
ATGTTCTTGC 



GGGTCCTTTC 
GACAGAGGGG 
GAGGGAGCAG AGACGAAGAC 
CAAGGAAATC TAAGTTTCGA 
AACACAAACA ACAAAGCAGA 
GAGCGCACCT CGCCGTGTCT 
CCACCCCGTT GTTGTCCCTG CAGGGCAGAA 
TCCCTGCTCC 
ATCACGATCC 
CATTTACTTT 



GAGAGAAACT GGCCTACGCC ATTGCACCCG 



CTTCATGCTG 



CTTCCTAGAC 



GACGCATGCA GGGCCCCCAC 



GGGTCCTTGT GGTGTAGGGA 



TTGTAGACGC 
GCCCATATTA 
AAACCACAAA 
TGAGCGTTCA 
CCTGGTATGT 
CTTGAAACCA 
ACGGCCTGAG 
CCTATCCCCG 



GGCAACTTCT ACTCTTCAAC 
CTICTGACTG 
TAATAACGAC 
AAACGTCTAA 
CTCTGAGGCT 
ACCTGCTGAG AATCCCCGTG 
AAAAGCCAAG TGTCCTGCTT 
GTCCTTTCCC 
TGCACACACA 
AATTCTGTTT 



GTTGACACAT 
CTCACCAGGA 
GCTCAGGCTA 
AGGAGCGTGT 



AACGCATGCA 



ACTACCCCAG 
ATGCAGGGCC 



AGGAGCGTGT A 



CCCATGCAGG C 



AGGAGCGTGT 
GACGCATGCA 
AGGAGCGTGT 
CCCAGGACGC 
CCCACAGGAG 
ACCAACACTC 
TTTCTCTCAG 



CATCAATAAC 
CACACTGCCC 



GCAGTATCCG C 



I TGCTGATATT 



CCTATCCCCG 
GGGCCCCCAC 
ACTACCCCAG 
GGGCCCCCAC 
CCTATCCCCG 
ATGCAGGGCC 
CGTGTACTAC 
TGCCTGGCCT 
GTGCGTGCCA 
TTACCTGTGA 
GCAGTTTTTG 
CTTTCCATGG 



2040 
2100 
2160 

22B0 

2400 
2460 

2580 
2640 
2700 



GGGAGGGACA 

GTTGTTGAAG 
AGCAACCCAG 
TAAGCACAAT 



TCTGCCACTG 
TGCCTACGTG CTGCCCGAGG 
GTGGACGTGG 
TGCCAGGCAG 
CAGAGGACGG CTTCCCCATC 
CATTGCCTTC TGGGGAGGGA 
ACAGCACAGA GAGCGGCTTC 
GTGTTGTCCG TGTCTGTTGA 
AAAAGACATC CACAATGGAA 



GCCTTCTGGC 



CCCATCGCCT 
CCAATCTCTA 
AAAAAAAAAG 



TGCAGGGCCA 
ATGCTCGGTG 
GGCTTCCCCA 
CGCTGCAGTC 
AGTTTCCCCA 



AGCACAGAGA 



MSKSKCSVGL MSSWAPAKE 
ETWGKKIDPL LSVIGFAVDL 
GQFNREGAAG VWKICPILKG 



RELVDRGEVR QFTLRHWLKV 



21 
I 

" PNAVGPKEVE 
ANVWRFPYLC 
VGPTVILISL 
SSGLNDTFGT 
TSGKWWITA 
SLGVGFGVLI 
DVAKDGPGLI 
LHRHRELFTL 
VGQFSDDIQQ 
GWVIATSSMA 



31 

I 

LI LVKEQNGV 
YKMGGGAFLV 
YVGFFYNVII 
TPAAEYFERG 
TMPYWLTAL 
AFSSYNKFTN 
FIIYPEAIAT 
FIVLATFLLS 
MTGQRPSLYW 
MVPIYAAYKF 



QLTSSTLTNP 



NCYRDAIVTT 
LPLSSAWAW 
LFCVTNGGIY 
RLCKKLVSPC 



51 
I 

RCSPVEAQDR 
MPLFYMELAb 
FTTELPWIHC 
DDLGPPRWQL 
IDGIRAYLSV 
SINSLTSFSS 
FFIMLLTLGI 
VFTLLDHFAA 
FLLFWWSI 
LAYAIAPEKD 



Coding sequence: 43..1422



GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 
AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 
CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 
TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGC-GCGAGGA GGATCTGCCC 



AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300

GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360 

TCCCIGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCICA AGAACCCCAG 420 

AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTC-GCGCTA TGGAGGCGAC 480 

CCGCCCTGGC CCCGGGTGTC CCCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540 

CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600

CTCCCGCCGC TCCCAGAACT GCGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720 

CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780 

CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840

GGGCGCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACCGCCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCGAGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCIGGA ICTTGGAGAA 1440

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA A

ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA A 



1 11 21 31 41 51

I I I I I I

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 
GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG
DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 
ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 
VEGHRFPAEI HWHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 
EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 
DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF
GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA 

Seq ID NO: 512 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..3978

1 11 21 31 41 51 

I I I I I I 

ATGGTGGGTG AAGGACCCTA CCTTATCTCA GATCTGGACC AGCGAGGCCG GCGGAGATCC 
TTTGCAGAAA GATATGACCC CAGCCTGAAG ACCATGATCC CAGTGCGACC CTGTGCAAGG

TTAGCACCCA ACCCGGTGGA TGATGCCGGG CTACTCTCCT TCGCCACATT TTCCTGGCTC 
ACGCCGGTGA TGGTGAAAGG CTACCGGCAA AGGCTGACCG TAGACACCCT GCCCCCATTG 
TCGACATATG ACTCATCTGA CACCAATGCC AAAAGATTTC GAGTCCTTTG GGATGAAGAG 
GTAGCAAGGG TGGGTCCTGA GAAGGCCTCT CTGAGCCACG TGGTGTGGAA ATTCCAGAGG 
ACACGCGTGT TGATGGACAT CGTGGCCAAC ATCCTGTGCA TCATCATGGC AGCCATAGGG

CCGACAGTTC TCATTCACCA AATCCTCCAG CAGACTGAGA GGACCTCTGG GAAAGTCTGG 
3 GACTGTGCAT AGCCCTTTTT GCCACCGAGT TTACCAAAGT CTTCTTTTGG 
A CTACCGCACG GCCATCCGGT TGAAGGTGGC GCTCTCCACC 
T GTCCTTCAAG ACATTGACCC ACATCICTGT TGGCGAGGTG 



CTCAATATAC TGTCAAGTGA TAGCTATTCT TTGTTTGAAG CTGCCTTC 

CCAGCCACCA TCCCGATCCT AATGGTCTTT TGTGCGGCGT ACGCCTTTTT CATTCTGGGG 780 

CCCACAGCTC TCATCGGGAT ATCAGTGTAT GTCATATTCA TACCCGTCCA GATGTTTATG 840

GCCAAGCTCA ATTCAGCTTT GCGAAGGTCA GCAATTTTGG IGACAGACAA GCGAGTTCAG 900

ACAATGAATG AGTTTCTGAC CTGCATCAGG CTGATCAAAA TCTATGCCTG GGAGAAATCT 960 

TTTACCAACA CTATCCAAGA TATAAGAAGG AGGGAAAGAA AATTACTGGA AAAAGCTGGA 1020 

TTTGTCCAAA GTGGAAACTC TGCCCTGGCC CCCATCGTGT CCACCATAGC CATCGTGCTG 1080

ACATTA1CCT GCCACATCCT CCTGAGACGC AAACTCACCG CACCCGTGGC ATTTAGTGTG 1140 

ATTGCCATGT TTAATGTAAT GAAGTTTTCC ATTGCAATCT TGCCCTTCTC CATCAAAGCA 1200

ATGGCTGAAG CGAATGTCTC TCTAAGGAGA ATGAAGAAAA TTCTCATAGA TAAAAGCCCC 1260 

CCATCT1ACA TCACCCAACC AGAAGACCCA GATACTGTCT TGCTTTTAGC AAATGCCACC 1320

TTGACATGGG AGCATGAAGC CAGCAGGAAA AGTACCCCAA AGAAATTGCA GAACCAGAAA 1380 

AGGCATTTAT GCAAGAAACA GAGGTCAGAG GCATACAGTG AGAGGAGTCC ACCAGCCAAG 1440 

GGAGCCACTG GCCCAGAGGA GCAAAGTGAC AGCCTCAAAT CGGTTCTGCA CAGCATAAGC 1500

TTTGTGGTGA GAAAGTTATG TCGTTATCCC GAAGCCCAGC TCCTGGCTTG GAOGTGGCCA 1560 

GCAGTGTTTG TTGGGAGAAT CATCAGAGGA TACAGGCCTC ATGGATTTTC TGCTAAAGAC 1620

AAGGATGAAT CTAGAAGGCT TCTTACTTGG CCCCAAGAAG TGGATAGGAC TCAAAGGGCA 1680 

GCCAAATACC TGGGGAAGAT CTTGGGAATA TGTGGGAATG TGGGAAGTGG AAAGAGCTCC 1740 

CTCCTTGCAG CTCTCCTAGG ACAGATGCAG CTGCAGAAAG GGGT3GTGGC AGTCAATGGA 1800 

ACTTTGGCCT ACGTTTCACA GCAGGCATGG ATCTTTCATG GAAATGTGAG AGAAAACATA 1860 

CTCTTTGGAG AAAAGTATGA TCACCAAAGG TATCAGCACA CAGTCCGCGT CTGTGGCCTC 1920

CAGAAGGACC TGAGCAACCT CCCCTATGGA GACCTGACTG AGATTGGGGA GCGGGGCCTC 1980 

AACCTCTCTG GGGGGCAGAG GCAGAGGATT AGCCTGGCCC GCGCIGTCIA CTCCGACCGT 2040

CAGCTCTACC TGCTGGACGA CCCCCTGTCG GCCGTGGACG CCCACGTGGG GAAGCACGTC 2100 

TTTGAGGAGT GCATTAAGAA GACGCTCAGG GGAAAGACAG TCGTCCTGGT GACCCACCAG 2160 

CTACAGTTCT TAGAGTCTTG TGATGAAGTT ATTTTATTAG AAGATGGAGA GATTTGIGAA 2220 

AAGGGAACCC ACAAGGAGTT AATGGAGGAG AGAGGGCGCT ATGCAAAACT GATTCACAAC 2280

CTGCGAGGAT TGCAGTTCAA GGATCCTGAA CACCTTTACA ATGCAGCAAT GGTGGAAGCC 2340 

TTCAAGGAGA GCCCTGCTGA GAGAGAGGAA GATGCTGGTA TAATCGGGIA CCTCCTTTCT 2400

CTCTTCACTG TGTTCCTCTT CCTCCTGATG ATTGGCAGCG CTGCCTTCAG CAACTGGTGG 2460 

CTGGGTCTCT GGTTGGACAA GGGCTCACGG ATGACCTGTG GGCCCCAGGG CAACAGGACC 2520

ATGTGTGAGG TCGGCGCGGT GCTGGCAGAC ATCGGTCAGC ATGIGTACCA GIGGGTGTAC 2580

ACTGCAAGCA TGGTGTTCAT GCTGGTGTTT GGCGTCACCA AAGGCTTCGT CTTCACCAAG 2640



ACCACACTGA TGGCATCCTC CTCTCTGCAT
CCAATGAGTT TCTTTGACAC GACTCCCACT 
ATGGACGAGC TGGATGTGAG GCTGCCGTTT 
ATGGTGGTGT TTATTCTCGT GATCTTGGCT 
GCCAGCCTTG CTGTAGGCTT CTTCATTCTG 
CTCAAGAAGG TGGAGAATGT CAGCCGGTCA 
CAGGGCCTGG GCATCATTCA CGCCTATGGC 
TCCAAAGGCC TGTCATTGTC ATACATCATC 
CGAACGGGAA CAGAGACGCA AGCCAAATTC 
TCGACCTGTG TTCCTGAATG CACTCATCCC 
CCCAGCTGTG GGGAGATCAC CTTCAGAGAC 



GACACGGTGT TTGATAAGAT 
GGCAGGCTAA TGAACCGTTT 
CACGCAGAGA ACTTTCTGCA 
GCTGTGTTTC CTGCTGTCCT 
TTACGCATTT TCCACAGAGG 
CCCTGGTTCA CCCACATCAC 
AAGAAGGAGA GCTGCATCAC 
CAGCTGAGCG GACTGCTCCA 
ACCTCCGTGG 



CTTAAAGAGC 
TTCCAAGGAT 
GCAGTTTTTT 
TTTAGTCGTG 
AGTCCAGGAG 
CTCCTCCATG 






CTATTGA IGAGGTGGAT ATCTGCATTC TCAGCTTGGA AGACCTCAGA 
ACCAAGCTGA CTGIGATCCC ACAGGATCCT GTCCTGTTTG TAGGTACAGT AAGGTACAAC 
TTGGAICCCT TTGAGAGTCA CACCGATGAG ATGCTCTGGC AGGTTCTGGA GAGAACATTC 
ATGAGAGACA CAATAATGAA ACTCCCAGAA AAATTACAGG CAGAAGTCAC AGAAAATGGA 
GAAAACTTCT CAGTAGGGGA ACGTCAGCTG CTTTGTGTGG CCCGAGCTCT TCTCCGTAAT 
TCAAAGATCA TTCTCCTTGA TGAAGCCACC GCCTCTATGG ACTCCAAGAC TGACACCCTG 
GTTCAGAACA CCATCAAAGA TGCCTTCAAG GGCTGCACTG TGCTGACCAT CGCCCACCGC 
CTCAACACAG TTCTCAACTG CGATCACGTC CTGGTTATGG AAAATGGGAA GGTGATTGAG 
TTTGACAAGC CTGAAGTCCT TGCAGAGAAG CCAGATTCTG CATTTGCGAT GTTACTAGCA 
GCAGAAGTCA GATTGTAG 

Seq ID NO: 513 Protein sequence
Protein Accession #: Eos sequence



I 

MVGEGPYLIS 
TPVMVKGYRQ 
TRVLMDIVAN 
ALAWAINYRT 
PATIPILMVF 
TMNEFLTCIR 
TLSCHILLRR 
PSYITQPEDP 



RLTVDTLPPL 



TLAYVSQQAW 
NLSGGQRQRI 
LQFLESCDEV 



AIRLKVALST 
CAAYAFFILG 
LIKMYAWEKS 
KLTAPVAFSV 
DTVLLLANAT 
SLKSVLHSIS 
PQEVDRTQRA 
IFHGNVRENI 



I I I 

' FAERYDPSLK TMIPVRPCAR LAPNPVDDAG 
STYDSSDTNA KRFRVLWDEE VARVGPEKAS 
PTVLIHQILQ QTERTSGKVW VGIGLCIALF 
LVFENLVSFK TLTHISVGEV LNILSSDSYS 
PTALIGISVY VIFIPVQMFM AKLNSAFRRS 
FTNTIQDIRR RERKLLEKAG FVQSGNSALA 
IAMFNVMKFS IAILPFSIKA MAEAJR 



LLSFATFSWj 



ATEFTKVFFW 
LFEAALFCPL 
AILVTDKRVQ 
PIVSTIAIVL 



ILIDKSP 420 



LTWEHEASRK STPKKLQNQK RHLCKKQRSE AYSERSPPAK 480 



ILLEDGEICE 
DAGIIGYLLS 
IGQHVYQWVY 
GRLMNRFSKD 
LRIFHRGVOE 
QLSGLLQVCV 
YQMRYRDNTP 
ICILSLEDLR 
KLQAEVTENG 
GCTVLTIAHR 



SKGLSLSYII 
PSCGEITFRD 
SGTIFIDEVD 
MRDTIMKLPE 
VQNTIKDAFK 
AEVRL 

Seq ID NO: 514 DNA sequence 
Nucleic Acid Accession #: Z31560 
Coding sequence: 1-966 



AKYLGKILGI 
LFGEKYDHQR 
QLYLLDDPLS 
KGTHKELMEE 
LFTVFLFLLM 
TASMVFMLVF 
MDELDVRLPF 
LKKVENVSRS 
RTGTETQAKF 
LVLDSLNLNI 
TKLTVIPQDP 
ENFSVGERQL 
LNTVLNCDHV 



YQHTVRVCGL 
AUDAHVGKHV 
RGRYAKLIHN 
IC-SAAFSNWW 



LLAALLGQMQ 
QKDLSNLPYG 
FEECIKKTLR 
LRGLQFKDPE 



TTLKASSSLH 



LQKGWAVNG 
DLTEIGERGL 
GKTWLVTHQ 
HLYNAAMVEA 
MICGPQCNRT 
DIVFDKILKS 
HAENFLQQFF HWFILVILA AVFPAVLLW 
PWFTHITSSM QGLGIIHAYG KKESCITYTS 
TSVELLREYI STCVPECTEP LKVGTCPKDW 
QSGQTVGIVG RTGSGKSSLG MALFRLVEPA 
VLFVGrVEYN LDPFESHTDE MLWQVLERTF 
LCVARALLRN 
LVMENGKVIE 



CACAGCGCCC GCATGTACAA C 



AGCCCGGACC GCGTCAAGCG 



CTGCGAGCGC 
AAGACGCTCA 
AATAGCATGG 
ATGGACAGTT 



AACTTTTGTC 
TGCACATGAA 
TGAAGAAGGA 
CGAGCGGGGT 
ACGCGCACAT 
ACCCGCAGCA 
ACGACGTGAG 
CGCCCACCTA 
TGGGTTCGGT 




3 TACAACTCCA 
C TACTCGCAGC 
C GAGGCCAGCT 



G GCCATTAACG 
3 AGAGTAAGAA ACAGCATGGA 



TCCGGGACAT 
GCAGACTTCA 
GCACACTGCC 
TCAAAGAAAA 
GAAAACCCGG 



TGGCATGGCT 
TGTGGnACC 
GATCAGCATG 
CATGTCCCAG 
CCTCTCACAC 
ACGAGGGAAA 
TACGCTCAAA 



KTLMKKDKYT LPGGLLAPGG NSMASGVGVG AGLGAGVNQR HDSYAHMNGW SNGSYSMMQD 180 



QLGYPQHPGL NAHGAAQMQP MHRYDVSALQ YNSHTSSQTY MNGSPTYSMS YSQQGTPGMA
LGSMGSWKS EASSSPPWT SSSHSRAPCQ AGDLRDMISM YLPGAEVPEP AAPSRLHMSQ 
HYQSGPVPGT AINGTLPLSH M 
tgcttt;
tcacagcagg 
tgacaaaaat 

GCTGTATGAG 
AGAGAATAAA 
ATTATATTTG 
ATTGAATGTG 
TCTTCAAAAA 

Seq ID NO: 



AATAAACCCA 



ACATGTGATT 
TGACAAACAC 
CTAATAGAAA 
AAATGGGGCC 



I I I I I I 

MMAGMKIQLV CMLLLAFSSW SLCSDSEEEM KALEADFLTN MHTSKISKAH VPSWKMTLLN 

VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA MLTIYQLHKI CHSRAFQHWE 

LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY 

Seq ID NO: 518 DNA sequence
Nucleic Acid Accession #: NM_006536.2
Coding sequence: 109..2940



I 

ACCTAAAACC 
ATGTATGCAG 
AGCATTGCAG 
GAACTCCCAT 
ATTGCAATTA 
ATAACTGAAG 



11 
I 

" TTGCAAGTTC 
CAGGCTCAGT 
GTCCTATTTG 
TCCTGGGAGC 
ATCCTCAGGT 
CTTCATTTTA 



AGGAAGAAAC 
GTGAGTGAAC 
CAACCTGAAG 
TGGAG7ACAG 



TCATATGAAA 
TACACCCTAC 
TTCCTACTGA 
GAATGGGCCC 
ATAAATCGGC 
GTGTGTGAAA 



AGGCAAATGT 
AATACAGAG G 
ATGATAACTT 



AAGGTCCTTG C 



CATCTGCATC 
TTTGTGACTC 
CAGAACCTCA 
GCTAATAATA 



ATGGGTATAA 



GAAGAGTAT7 
ACAGCAAAAT 
GGGCACATGG 
ACATTCATTT 



AGTTTATCTT 
CTACAGAACC 



GTACAGGCTG 
GCTGACAGAC 
ATTCATACCT 
CACCAAATTA 
TCAGCTAAAA 
AAACTGAATG 
CTTCTTGGCA 
CTGGGTTCAT 



AGATGTGCAG 
GCTTTCCCAT 
GTGACAAAGT 
TCCTTCAACT 



TCTGGAACTG 
AAACCTCACC 
ATGTTTCTAG 
GGACGAAAAT 
TGGATTCCAG 
TCTCTGCAAG 



ACAGCAATGA 
C AG ACAT C AG 
GAAAAGCTTA 
ATTGCTTACC 
CTGCAGCCCC 
C CAGATATATC 



ATTTTGTAAT 
CCTCAGAAGT 
GAATGGGACT 
GGTCTGTTTA 
ACAACAAGCC 
TGCCAGTTTC 
TG AT CGAAAG 
CATTTGTTCA 
TGGCTCTGTG 
CACTGTGCTC 
AAATCTGGAG 
AAACTCCAAT 



ACAATGACAA 
CTGACATCAC 
TTAGTAAGCT 
CATCAATAAT 
GCAAGTACCC ACAACCAAGA 
GCATGGGATG TAATCACAGA 
GAGCTTCCAC 
GTGCTGGATG 
GCAGAATTTT 
GACAGCAAAG 

GGGCTTAAGA 
ATGATATTAG 
AGCAGTGGTT 
GAATTATCAC 
AGCATGATTG ATGCT7TCAG 



GACCCAAAGG 
CTTAAGTTCA 
TGGATTGCTC 
TAAGGAAATG 
TTTCAGAAAT 
AAAACAAGAA 
AGATGATCCA 
CACACCTAAT 
GTT7GTCCAT 
ACCTTTCTAC 
AGGCATTTTT 
TTTTAAAGAA 



TGTCCAGCAA 
ATTTGATGCA 
GAGAGATCAG 
CATATCTGCC 
AAGGATTTGA 
7GACCAGCGG 
CAACAATTCA 



AGCACCAAAC 
CTCTGCTGAC 
ATTCTCGCTT 



GATTGTTGAA 



ATCAATTGAA 
TTACGTGGCA 
ACTACACAAA 
GAACAGCTAA 
CCCTGAAAGT 



AAACACAGTG 



TAATTTTATC 
GCCTGGGCAC 
GACAGTGACC 



ACTGTGGATA 
CCTCCTGAGA 
ACCAATCTAA 
TGGACTTACA 
TCTCGCGCCT 



TATGCCAATG 



TATAGCTTGA 
CCAGGGAGTC 
GCTCCAAGGA 
AGCTCAGGAG 
CCACCATGCA 



CTGGAGATCC 
ATGATGGAAT 
AAGTGCATGT 
ATGCTATGTA 
AATCAGTAGG 
GCTCCTTTTC 
AAATTATTGA 



ATTTTATCCC A' 



ATACTGTGGG 
TTATATTATT 
CTTTTCGGAC 
CCCTGAACAA 
CCAACTCAGC 
7TCCTCATCC 



CACCACTGTA 
GGTGGTTGAA 
AGATGATAAG 
CTCCATTGCC 
AGGTTTAAAG 
TAGAATTTCC 
TGAAAATGTC 
CAACGACACT 
TGA7CCTGAT 
AGCTAGTCTT 
TACCCATCAT 
TGTGCCCCCA 
TGTGATGATT 



AGTAAAAGTC 



CTCAGCAAGC 
CTGAACATCA 
CAATGGATAG 
TTCCCCCCAA 
CAGCAATGGG 



TTACTCGAGG 
CAATCACTCT 
TGTACCAGGT 
CAGAAATGAG 
AGTGCTGGGA 
CCTGGAAGCT 
CTTTGATCAG 
CCAAGATGAC 
TGGCATCAGG 
GCCAAATGGA 
GAACTCCTTA 
TTCTGATCCT 
TTTGATAGGA 



CCCAGCATAA 
GAGGAGCGAA 



ATGATGGAGC AGGTGCTGAT 2100 
CCTTTGCTGC A 
GCACCCCAGC C 
ACGGTAATAT T 
AGTGGGGCT7 I 
GCCCCCACCC T 
AAGAGGAATT GACCCTATCT 
CAAGCTATGA AATAAGAATG 



C TGA7GTGTTT 



GAGATATTTA CGTTC7CACC CCAGATT7CC 
GAAACACATG AAAGCCACAG AAT7TATGTT 
CAGTCTGCTG TATCTAACA? TGCCCAGGCG 
GTACCTGCCA GAGATTATC7 TATATTGAAA 
ATCATTTGCC TTATTATAGT TGTGACACAT 



CATACTTTAA GCAGGAAAAA GAGAGCAGAC
C AAAGTGTCTT CCTTCTTAGA 
A AAGTCAAATT AACATCAAAA 
A GATTTTTACA TGGTAGATCA 
CCTTACACTT TGGCTATGAA CAAATAATAA 
GCAAAGGGAA GGGTAAAGTC GGACCAGTST 
AATAGCCCCA AGCAGAGAAA AGGAGGGTAG 
TCATTTAGTT ACTITGATTA ATTTTTCTTT 



AAGAAAGAGA 



CTGTATTAAA 
ACAATTCTTT 
AAATTATTCT 
CAAGGAAAGT 
GTCTGCATTA 
TCTCCTTATC 



ATGGCCTTCG 
ATGCATTGAG 
TTGGGGGTAG 



GTCTTTAAAG 



3000 
3050 



TAACTG7CTG 
TGTGCAGTAC 
TAATGCAAAG 



CTTTGTCTCT 
TCTACTCCCA 
TATAGCCCCN 



TGTGAAGCAA 
AGGTTGCTTG 
CTCTTTACCT 
AGAGATCT7T 
GDDPYTLQYR
KPFYINGQNQ 
MFMQSLSSW 
TFSLVQAGDK 



GDDKLLGNCL 
SRISSGTGDI 
FDPDGRKYYT 



CNLKFVTLLV 
YLFNATKRRV 
GCGKEGKYIH 
IKVTRCSSDI 
EFCMASTHNQ 
WCLVLDVSS 
DDRKLLVSYL 



ALSSELPFLG AGVQLQDNGY 
FFRNIKILIP 
FTPNFLLNDN 
TGIFVCEKGP 



FQQHIQLEST 
NNFITNLTFR 
VERDSLHFPH 
IYSRYFFSFA 



KMAEADRLLQ 
PTTVSAKTDI 
HSIALGSSAA 



NGLLIAINPQ 
IKQ-JSYEKAK 
VFVHEWAHLR 
LFKEGCTFIY 
DSADFKHSFP 



TASLWIPGTA 
PVMIYAKVKQ 
ANGRY SLKVH 



GFYPILNATV 



DFDOGQATSY 
QPNGETHESH 
GLIGIICLII 



EIRMSKSLQN 
RIYVAIRAMD 
WTHHTL3RK 



SVLGVPAGPH 
IQDDFNNAIL 
RNSLQSAVSN 



NTHHSLQALK 
TATVEPETGD 
AHSIPGSHAM 
PDVFPPCKII 
VNTSKRNPQQ 
IAQAPLFIPP 
KLL 



IASFDSKGEI 
YGSVMILVTS 
SNSNSMIDAF 
QASGPPEIIL 



DLEAVKVEEE 
AGIREIFTFS 
NSDPVPARDY 



Seq ID NO: 520 DNA sequence
Nucleic Acid Accession #: NM_000228
Coding sequence: 82..3600



I 

GCTTTCAGGC 
GGATCACCCC 
CTCCTGCATG 
CTTGTTGGGA 
ACCTACTGCA 
CCTCACAACT 



TTCCAGCTTC 
GAGCGCTCCT 
ACCTCCACCT 
CAGTCCCTGC 
ATGGATTTAG 
ATCACAAACT 
CCTCCCAGCG 



ATTGGCTGAA 
CCCAACAAGC 
GGACCCGGTT 
CCCAGTATGG 
ACTACAGTCA 
CCCAGAATGA 
AAGAAGTCAT 
CAGACTTCGG 



AAGAACGGCA 



CTGCTCCCGT 



TTCTTCCTCT T 



ATCCACCTGT 



T GCAAGTGTGA 



CCTGCCTGGC 
TGGGGACCTG 
CAAGCCTGAG 



ACAACAACCG 
ACTGCAATGG 
GGGCATATGG 
GGTGTCAGCT 
TCTCCTGCGA 
GGCAGTGTGT 
TCACTGGACT 



GATGGAGTTC 
TAAGACCTGG 
CCGCCAGGGT 
TAATGCACGC 
TCCAGCAACT 
TTTCACCAGG 
TGTGTCCCAG 
ACCCAAGCCT 
CTGCCAGCAC 



CTAAATGGGG 



AGCTGGACCT 
TGCCCGCCGG 
AGTACC7GGC 
GCTGGCAGGA 



CCCCATGCGC 



CATGCTGATT 
TGCCGACTGC 
TGTTCGGTGC 



CTGGCCCCTG 
CTCCGTCTGC 
GGGGCCTCTG 
AACACTGCCG 



AAATTCAAGA G 



AGGGGAGCTG 
CAGGCCCCTC 
GCCCAAATTG 



GCACTCAGAG 
AGGTGTGTGT 
GCACTATTTC 



AGTGGCCAGG GCTGTGAACC 
CAACCAGTTC ACAGGGCAGT 
GCAGCCATCC GCCAGTGTCC 
TGTGACTGTG ATTTCCGGGG 
CTCTGCCGCC CTGGCTTGAC 
CGCTACCCGG TGTGCGTGGC 



GTGCAAGGAG 
CACCTACGCC 
GGACATGCCG 
CAAATGTGAC 

GCCCTGTCGG 
AGACCGGACC 
AACAGAGGGC 



TTGACCCCGC 
GGGACCACAC 
GCCCGGGAGC 
TGCCAGGGGC 



ACATGTCACT 
GACAATTGCC 
CGGAACCGGC 
GATGGGGCAG 
CATGTGCAGG 
AACCCGCAGG 
TGTGACGAGG 
CAGTGTGCTC 
GACCCGCACA 
GAAGGCTTTG 
TATGGAGACG 
CCGGGCTGCG ACAAGGCATC A 



AGAGTGGGCG 
CCTACCACTG 
ACTCCCCTCA 
G7GGCC7GAT 



CGAAGGCAAG 
TTCCATTCAG 
TCCCTGTGAC 
TGACCTATGC 
CTGTGACTGC 
CTGCCTTTGT 
GAAGCTGGCC 
GCCCACAGTG 



CTGCCACCCT T 



GGGCTGGAGG ACCGTGGCCT 
ATCCGAGCAG TTCTCAGCAG 
GCCATCCTCT CCCTCAGGCG 
GAGACGTTGT CCCTTCCGAG 
ACTATGTATC AGAGGAAGAG 
GCCTTCCGGA TGCTGAGCAC 
GACAGCTCGC GCCTTTTGGA 



GGCCTCCCGG 
CCCCGCAGTC 
AACTCTCCAG 



GGAGCAGTTT 



ATCCTAGA7G 
ACAGAGCAGG 
GGCCTGCAGC 
AGTCTTGACA 



ATGTCTTCGT 
ATGGCTTGCA CCCCAATATC 
TGTGGCTCCC GCIGCAGGGG 
CAGGTGGCTG AGCAGCTGCG 



CCAGCTCAGG 
AGGCACCGGC 
GACACCCACC 



CAGTCAGCCC A 



AGCCCCAAGC T7GTGGCCCT GAGGCTGGAG 
TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 
GAGCTATGTC CC 
AGGGCCGGTG GGGCC7TCTT GATGGCGGGG 
GCCCAGCTCC AGCGGACCAG GCAGATGATT 



TGGCACAGCC 2520 



AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCGCTT GGAGACCCAG 2700

GTGAGCGCCA GCCGCTCCCA GATGGAGGAA GATGTCAGAC GCACACG3C? CCTAATCCA3 2760 

CAGGTCCGGG ACTTCCTAAC AGACCCCGAC ACTGATGCA3 CCACTATCCA SGAGSTCASC 2820 

GAGGCCGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA C7GTTCT3CA GAAGATGAAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCT3TC CCAGACCAA3 2940

CAGGACATTG C3CGTGCCCG CCGGTTGCAG GCTGAGGCT3 AGGAAGCCAG GAGCCGAGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACCAGCCGCT CCCTTCG3CT TATCCAGGAC 3120 

3 AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGT3AC AAGCATGACC 3180 

^CTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GCGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

GCCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA CCGGTTGGGT 3360 

GAGCTGTTTG GGGAGACCAT GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CTGCGGGGCA GCCAGGCCAT CATGCTGCGC TCGGCGGACC TGACAGGACT GGAGAAGCGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG CGCGTGCTCT ACTATGCCAC CTGCAAGTGA 3600

TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTOCTTTTG GTTGGGGGCA 3660 

GATTGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACCCCT GGTGTGTAGC TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TGGAGATTG3 CATGCCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGGGGA 3900 

GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960 



MEPPPLLCPA LPGLLHAQQA CSRGACYPPV GDLLVGRTRP LRASSTCGLT KPETYCTQYG 
EWQMKCCKCD SRQPHNYYSH RVENVASSSG PMRWWQSQND VNPVSLQLDL DRRFQLQEVM 
MEFQGPMPAG MLIERSSDFG KTWRVYQYLA ADCTSTPPRV RQGRPQSWQD VRCQSLPQRP 
NARLNGGKVQ LNLMDLVSGI PATQSQKIQE VGEITNLRVN FTRLAPVPQR GYHPPSAYYA 



300 



HYFRNRRPGA SIQETCISCE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGPTGL 420 
TYANPQGCHR CDCNILGSRR DMPCDEESGR CLCLPNWGP KCDQCAPYHW KLASGQGCEP 480 
CACDPHNSPQ PTVQPVHRAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540 
TEGPGCDKAS GRCLCRPGLT GPRCDQCQRG YCNRYPVCVA CHPCFQTYDA DLREQALRPG 600 
RLRNATASLW SGPGLEDRGL ASRILDAKSK IEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660 
TLQGLQLDLP LEEETLSLPR DLESLDRSFN GLLTMYQRKR EQFEKISSAD PSGAFRMLST 720 
AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGGG G7GSPKLVAL RLEMSSLPDL 780 
TPTFNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG VLPRAGGAFL MAGQVAEQLR 840
GFNAQLQRTR QMIHAAEESA SQIQSSAQRL ETQVSASRSQ MEEDVRRTRL LIQQVRDFLT 900
DPDTDAATIQ EVSEAVLALW LPTDSATVLQ KMNEIQAIAA RLPNVDLVLS QTKQDIARAR 960 

RLQAEAEEAR SRAHAVEGQV EDWGNLRQG TVALQEAQDT MQGTSRSLRL IQDRVAEVQQ 1020

VLRPAEKLVT SMTKQLGDFW TRMEELRHQA RQQGAEAVQA QQLAEGASEQ ALSAQEGFER 1080 

IKQKYAELKD RLGQSSMLGE QGARIQSVKT EAEELFGETM EMMDRMKDME LELLRGSQAI 1140 
MLRSADLTGL EKRVEQIRDH INGRVLYYAT CK 

Seq ID NO: 522 DNA sequence 

Coding sequence: 84..3083

TTTTCTTAGA LtAACTGC L C GGCTGG LgaTAGAA GCAGCGGCTC ACTTGGACTT 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 120
CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 180 
ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACGT GAATG3GTGA 240 
AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 
TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACCG AATCTCTGGA GTGGGAATCG 360 
ATCAGCCGCC TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 
CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480
AAGGACTAGA TGTAGASAAA CCACTTATAC TAACGGTTAA AATTTTGGAT ATTAATGATA 540 
ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600 
ACTCACTGGT GATGATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TIGAATTCTA 660 
AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720 
GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 
ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840
GTAATATTAA AGTGAAAGAT GTCAACGATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 
CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGCG TSAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260

AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 1320 

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATG3GACGT A^CGATGGTG 1380

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAA^T ATGAACCGAG 1440

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500 
CGGGTAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC CGATTTCAAT G 
CAACAGCTGT CCTCGAAAAA GATGCAGTTT GCAGTTCTTC ACCTTCC 

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCA.ACCTG 1680 

TAAAGTTGCC TGCCGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 1740 

CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATCGGTG TGAGATGCCA CGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860
GCATCTGTGG AACTTCTTAC CCAACCACAA GCCCTGGGAC CAGGTATGGC AGGCCGCACT 1920

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACT3GG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAArTGAAG 2100 

GAGCCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160 

GAGCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC G7ATGCCAGA GGCACAGCGG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGG AGCAGCCAC7 GAATCTGGAG 2280

GTGCTGCAGG CTTTGCAACA GGGACAGTGT CAGGAGCTGC T7CAGGATTC G3AGCAGCCA 2340

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGCGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520

TGTTGATCTA TGATAATGAA GGCGCAGATG CCACTGGTTC TCCTGTGGGC TCCGTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640

TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCAG GAAGTCAAGG AGCTTCTGCT TTGTCCGCCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGG? AACTATTTAG 2880 

TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCAT? TCCAGTGTTC 3000 

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CCGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACGATTAT AAATTAAATG TTTGGGTTCA 3240

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTCGC 

Seq ID NO: 523 Protein sequence 
Protein Accession #: NP_001935.1 

1 11 21 31 41 51 

I I I I I I 

MMGLFPRTTG ALAIFWVIL VHGELRIETK GQYDEEEMTM QQAKRRQKRE WVKFAKPCRE 60

GEDNSKRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIF WDKNTGDIN ITAIVDREET 120 

PSFLITCRAL NAQGLDVEKP LILTVKILDI NDNPPVFSQQ IFMGEIEENS ASNSLVMILN 180 

ATDADEPNHL NSKIAFKIVS QEPAGTPMFL LSRNTGEVRT LTNSLDREQA SSYRLWSGA 240 

DKDGEGLSTQ CECNIKVKDV NDNFPMFRDS QYSARIEENI LSSELLRFQV TDLDEEYTDN 300 

WLAVYFFTSG NEGNWFEIQT DPRTNEGILK WKALDYEQL QSVKLSIAVK NKAEFHQSVI 360 

SRYRVQSTPV TIQVINVREG IAFRPASKTF TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 420 

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NRDSTFIVNK TITAEVLAID EYTGKTSTGT 480 

VYVRVPDFND MCPTAVLEKD AVCSSSPSW VSARTLNHRY TGPYTFALED QPVKLPAVWS 540 

ITTLNATSAL LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVCQCD NRGICGTSYP 600 

TTSPGTRYGR PHSGRLGPAA IGLLLLGLLL LLLAPLLLLT CDCGAGSTGG VTGGFIPVPD 660 

GSEGTIHQWG IEGAHPEDKE ITNICVPPVT ANGADFMESS EVCTNTYARG TAVEGTSGME 720

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATGVGICSS GQSGTMRTRH STGGTNKDYA 780 

DGAISMNFLD SYFSQKAFAC AEEDDGQEAN DCLLIYDNEG ADATGSPVGS VGCCSFIADD 840 

LDDSFLDSLG PKFKKLAEIS LGVDGEGKEV QPPSKESGYG IESCGIIPIEV QQTGFVKCQT 900 

LSGSQGASAL SASGSVQPAV SIPDPLQHGN YLVTETYSAS GSLVQPSTAG FDPLLTQNVI 960 
VTERVICPIS SVPGNLAGPT QLRGSHTMLC TEDPCSRLI 

Seq ID NO: 524 DNA sequence



I I I I I I 

ATGAAGTTTC TTCTAATACT GCTCCTGCAG GCCACTGCTT CTGGAGCTCT TCCCCTGAAC 60 

AGCTCTACAA GCCTGGAAAA AAATAATGTG CTATTTGGTG AAAGATACTT AGAAAAATTT 120 

TATGGCCTTG AGATAAACAA ACTTCCAGTG ACAAAAATGA AATATAGTGG AAACTTAATG 180 

AAGGAAAAAA TCCAAGAAAT GCAGCACTTC TTGGGTCTGA AAGTGACCGG GCAACTGGAC 240

ACATCTACCC TGGAGATGAT GCACGCACCT CGATGTGGAG TCCCCGATGT CCATCATTTC 300 

AGGGAAATGC CAGGGGGGCC CGTATGGAGG AAACATTATA TCACCTACAG AATCAATAAT 360

TACACACCTG ACATGAACCG TGAGGATGTT GACTACGCAA TCCGGAAAGC TTTCCAAGTA 420 

TGGAGTAATG TTACCCCCTT GAAATTCAGC AAGATTAACA CAGGCATGGC TGACATTTTG 480

GTGGTTTTTG CCCGTGGAGC TCATGGAGAC TTCCATGCTT TTGATGGCAA AGGTGGAATC 540

CTAGCCCATG CTTTTGGACC TGGATCTGGC ATTGGAGGGG ATGCACATTT CGATGAGGAC 600 

GAATTCTGGA CTACACATTC AGGAGGCACA AACTTGTTCC TCACTGCTGT TCACGAGATT 660 

GGCCATTCCT TAGGTCTTGG CCATTCTAGT GATCCAAAGG CCGTAATGTT CCCCACCTAC 720

AAATATGTTG ACATCAACAC ATTTCGCCTC TCTGCTGATG ACATACGTGG CATTCAGTCC 780 

CTGTATGGAG ACCCAAAAGA GAACCAACGC TTGCCAAATC CTGACAATTC AGAACCAGCT 840

CTCTGTGACC CCAATTTGAG TTTTGATGCT GTCACTACCG TGGGAAATAA GATCTTTTTC 900 

TTCAAAGACA GGTTCTTCTG GCTGAAGGTT TCTGAGAGAC CAAAGACCAG TGTTAATTTA 960 

ATTTCTTCCT TATGGCCAAC CTTGCCATCT GGCATTGAAG CTGCTTA7GA AATTGAAGCC 1020 

AGAAATCAAG TTTTTCTTTT TAAAGATGAC AAATACTGGT TAATTAGCAA TTTAAGACCA 1080

GAGCCAAATT ATCCCAAGAG CATACATTCT TTTGGTTTTC CTAACTT7GT GAAAAAAATT 1140 

GATGCAGCTG TTTTTAACCC ACGTTTTTAT AGGACCTACT TCT7TGTAGA TAACCAGTAT 1200 

TGGAGGTATG ATGAAAGGAG ACAGATGATG GACCCTGGTT ATCCCAAACT GATTACCAAG 1260

AACTTCCAAG GAATCGGGCC TAAAATTGAT GCAGTCTTCT ACTCTAAAAA CAAATACTAC 1320 

TATTTCTTCC AAGGATCTAA CCAATTTGAA TATGACTTCC TAC7CCAACG TATCACCAAA 1380
ACACTGAAAA GCAATAGCTG GTTTGGTTGT TGA
YTPDMNEEDV DYAIRKAFQV WSNVTPLKPS KINTGMADIL WFARGAHGD F

LAHAFGPGSG IGGDAHFDED EFWTTHSGGT NLFLTAVHEI GHSLGLGHSS DPKAVKFPTY 

KYVDINTFRL SADDIRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 

FKDRFFWLKV SERPKTSVNL ISSLWPTLPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 

EPNYPKSIHS FGFPNFVKKI DAAVFNPRFY RTYFFVDNQY WRYDERRQMM DPGYPKLITK 

NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC



CCGATGGCCG 



GCTCTCGGCA CCCTCCCGG 



GACAAAAGGA 
TCGAAGACAA 
ATTCCTTGCT 
GAATCTGATG 
AAAGAACCTT 
CCTGTGGATC 
GGATATTCAG 
CACCCTGTTT 
ACTACAGTGG 
CTGAAATACA 
AGCACAGGCG 



GAAAATGGAC 
GTAAAGCCAC 
GAAGCGCCAT 
GTTCATGTGA 



AATAGAAATG 



ACTGGAACAC 



GAAAGAGGT' 
AGAAACTGTT 
GAATTCCTTG 
CTATACTGTC 
TTATATAGAA 
TGATGTTTTT 
CCTCCCACTA 
AATTTATAAT 
TGCCACAGAC 
GCAGACACCA 
AGTCTCTCAT 
AGACATGGAT 
AGATTCAAAT 
AAATGCATTC 
TGCCAATTGG 
CAGCACAGAC 
AGAAAACCGT 
TATTCCCAGA 
TGAGGGGCCT 
ACTTAGCAGT GGGGTCAAAG 
AAGGTACAAA 
AATCATAACT 
TATTACAGTC 
CATTGAAGAT 



GTGAAGAATA 
CAGATCTGCC 
TCACAGAAGC 

GCATTTTGCA 
TAATCACCAC 
TGAAAGTACA 
TAACAGTAAC 
TTGTAGAGGA 
TAATTAACAC 
ATTT CAAAAT 
TGAATTATGA 
TTGCTAGAGA 



TTCTACTCAA 
AGAGACACTG 
GATTTGATTG 
CCCATCAGGG 
TTTGAAGTTT 
AGAGATGAAC 



TGGAGTTGAT 
GAAATCTATT TTGCACTCGG 
CTTATGCGTC AACTGCAGAT 



7GGAAAGTAG 



GGCTCTTTTC 
GAGAGGTTG7 
77GGA7TGA7 
CCACTTTCAG 
7CTTACGAA7 
7TACCATTT7 



TAGACCTGGT 
GCATACGCGC 
TGTGCATCCC 
AGACAAGTAC 
AGGCACATCA 



ACCTATAGAA 



AGTTGTATAA 



GATGAACCTG 
AGTAGACTGT 
AATGCTGGAT 
GCAACAAAAT 
ACTTCAAGGA 
ATAGCACTGC 



CCTGGAGACG 



GAAATGATGA 
ACCCTGGACT 
GAGTGG CACA 
TAAAAATTAA 
CCAAGATTAT 
CTGCTGCAGT 



TTCCTGAAGA 
ATAGAGTGTG 
TTTGTGGTAC 



TCCATTTTAT 
CAAAGTTAAT 
TACCATTCCT 
TAATCTGTGT 
AATACTTGGA 
ATTGCTAACT 



CTCTGCCAAT 



CCTGCAGGGG 



AGT AT CACTA 
ATGTTGCAGC 
TTGGAGGCAA 
GGAAATAAAT 



ACATAAAAGA 
GTCCTCACTT 
GAAAAGCAGG 
TTAG CAGAAG 
ATTCTGGAGG 
GATTTTTTTC 

TGCTTATCTT 
CAGCACTGGA 
TAATAAATAT 
ATTATGTATT 
TGTGAAGAAA 
TCATAAAGAA 



CCAGACCTTG 
AGGACACACG 
ACCCCGTCTC 



CAAGTGAACC 
GTGACAGCCT 
GAATGCACTC 
ATCAACGGCT 
AAATTGCATG 
TCCAAAATCC 
CTGGCAATAG 
GTAAATGATA 
GGGTATACCG 
TTCAGTTTGC 
3 AT ACAG CTG 
ATTACTGTAA 
GAATGTACTC 
AAATGGGCAA 
TTAGTATGTG 
CAAAACTTAA 
GGATTTATGA 
GGAATGAAAA 
GAATCCTGCC 
GAGGTGGACA 
GGTGAAGAAT 



TCTTTCTGTT 
AGTAAACAAT 
CTTGGTTACA 
ATATGTGCGG 
ATAAGGCATA TGACCCCGAA 
ATCCTAAAGG TTGGATCACC 
7GGA7AGGGA GG7TGAAACT 
ACAAAGATGA 
ATCCACCAGA 
ACATTTTAGC 
CCAATACTTC 
CCCG7CTTTC ATATCAGAAA 
AAGACAGGGC CGGCCAAGCT 



GAGT7TTTGG 
TTATATCAAA 
CCCAAACTAC 
ATGGAGGGCA 



GTGTCGTGCG 
ATTACTGGGT 
TGCAACTAAA 



ATAACTATGA G 



ACTGCAGATA CACTTACTCG 
CCAT7AGAGG ACACACTGGT 
AATGAAGACC GCATGCCATC 
TCTCCAGCTG GTTCTGTGGG 
TTAAATAATT 



ATTAAGGTCT 
GCTGGATAAA 
CACTTTAAGT 



ATTATGCTAC 
TGAAAAATGT 
CTAAAGCATC 
TATTAGTCCA 
GATAGTTTAA 



GTGTGTGTGT 
AAAATGGTAA 
AGAGCTTCCT 



AAACTTGAAA 
AGGCCTGGGC 
ATTTGTGTAA 
AAATAGGGAA 
CCACAAGTTA 



CCCCTACTGC 
GAAGTAGCAA 
TAATCAATGC 



AAG77CAA77 
TCACCAATTT 
TAAAACAGAC 
7GC7C7TT7T 
ACAATAGCTA 
AAAATAAACA 
AAGACTGAAT 
ACTACCAAAT 
77T7C7A7AG 



TCAACATGTA 
ATATTTTTAA 
AACTGGTAAA 



ATAACAAAAA CATTTTAAAA 
A TAGTGACCAA 
3 CTAGAGGGAG 
A AACCTAAGCC 
T CTCCTCACTG CCCTTCTTCT 



CATCTTTTTA 
CTGAGGGGAG 
CCACAAACTT 



TCTTAAATGC 
TTTGTTTAAA 
TCCAATGGAA 
GTAGCAAACT 
CAGAGATGAG 
CTGAAGTTAA 
ATTTAGATCC 



GTATAGTTTG 
7GCA77A7AA 
ATTGTAAATA 
CAG7AGC777 



TCATTTGACT 
GAATATAGTT 
ATGAAATGAG 
TCCTACAATA 
CTGAGTCTAT 
AATTAAACTT 
GC77TGCAG7 
CGCTGCAGCT 



C TTCAGCGTGA 
A AGAAATAGAA 
G GAACTT7GGG 



GCTGTTTCTA 
AAATAACCAT GTCCTCCTAG 
AAAGCACCCT GGGGAGA77G 
CAGGTCTGGG AGCTACAAAA 
GGCC7GAA7C AAGGAAAGCC 
GCAGAGATTC 



2280 
2340 
2400 
2460 

2S60 
2640 



3000 
3060 
3120 
3180 



3360 

3460 

3600 
3660 
3720 
3780 



C AACAGGAAGA TGCAGGCCTT 



AGAAAGCAGC CCAAGTAGGT 
CAAGGGCAAG GAGAGGCCAC 
lTACTTTTTC C
CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTG3TTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740 

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800

TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA G7GGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTAGAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC 3TGATCCGO TGCCTCGGCC 5040 

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA A3AACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT C7AACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT 7TACAGATGT GGGGAGA7GT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTG- TTTTGCTGTA 6000 

TTTAGAGATT - AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTA7 GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAAT CTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960

ATGTAGTTGG ATATACTACC GAACAATATC TAA7CTC7TT TTAGGGAAA7 AAAGTTTGTG 7020 
CATATATATA ATCCCGAAAC ATG 



MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RWLEECFRS 

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEV7 VLLEHQKKVS 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 

PVFTEAIYNF EVLESSRPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 

TGVITTVSHY LDREWDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSW 

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YWICKPKMG YTDILAVDPD 

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKHGGQETIE 

MHKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSF7QPRLG EESIRGHTG 

Seq ID NO: 528



CTOACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 
CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCA3G 
TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 
TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATT7ACCAT ATGGCTTTC'i 
GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 
TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACGT 
ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 
GAATCTGATG CAGCACAGAA CTATACTGTC TTCTAC7CAA TAAGTGGACG TGGAGTTGAT 
AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT T7GCACTCGG 
CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTCGATTG CTTATGCGTC AACTGCAGAT 
GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 
CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 
ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 
CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC
AGCACAGGCG TAATCACCAC
TCATTGATAA TGAAAGTACA 
ACTTGTATCA TAACAGTAAC 
TATGAAGCAT TTGTAGAGGA 
GATAAGGATT TAATTAACAC 
GAAAATGGAC ATTTCAAAAT 
GTAAAGCCAC TGAATTATGA 
GAAGCX3CCAT TTGCTAGAGA 
GTTCATGTGA GGGATCTGGA 
ATTAAAGAAA ACTTAGCAGT 
AATAGAAATG GCAATGGTTT 
ATTGATGAAA TTTCAGGGTC 
CCCAAAAATG AGTTGTATAA 
ACTGGAACAC TTGCTGTGAA 
GAATATGTAG TCATTTGCAA 
GATGAACCTG TCCATGGAGC 
AGTAGACTGT GGAGCCTCAC 
AATGCTGGAT TTCAAGAATA 
GCAACAAAAT TATTGAGAGT 
ACTTCAAGGA GTACAGGAGT 
ATAGCACTGC TCTTTTCTGT 
GGGAAACGTT TTCCTGAAGA 
CCTGGAGACG ATAGAGTGTG 
AGCCAAGGTT TTTGTGGTAC 
GAAATGATGA AAGGAGGAAA 
ACCCTGGACT CCTGCAGGGG 
GAGTGGCACA GTTTTACTCA 



AGTCTCTCAT T 



AGATTCAAAT 
AAATGCATTC 
TGCCAATTGG 



GGGGTCAAAG 



AATGTGGAAA TCTTACGAAT 
AGAGTCAATT 
AAAGAAACTA 
CAAGTGAACC 
GTGACAGCCT TGAACAGAGC 
GAATGCACTC CTGCAGCCCA 
ATCAACGGCT ATAAGGCATA 
AAATTGCATG ATCCTAAAG3 



T AGACAAGTAC 
T AGGCACATCA 
G ACAAAATGCT
AAAGGGAAAT
AGTAAACAAT 



CATTGAAGAT 



CTGGCAATAG 
GGGTATACCG 



GGTTGAAACT 



T GATACAGCTG 



ATCCACCAGA 
ACATTTTAGC 
CCAATACTTC 



CCAGCTGGTT 
AATAATTTGG 
AGTGCTACAA 
TTCAATTTCA 
CCAATTTATA 
AACAGACAAC 
TCTTTTTTTT 
ATAGCTAAGT 
ATAAACAAGA 
ACTGAATTAA 



TCTATAGGAA 
ATTTAAAATG 
TAGTTTGTCC 
ATTATAACTG 
GTAAATAAAT 
TAGCTTTGCT 
GAATACTCGC 
TTTTTTTCGG 



ATTGCTAACT 
TTTAGCACAG 
CTCTGCCAAT 
TATGGGATCA 
CCAGACCTTG 
AGGACACACG 
ACCCCGTCTC 
AGATTATGTC 
CTGCAGTGAA 
TATTACATTA 
GTCAGACATT 
AT AT GATGAT 
CAGTTGTTGC 
CAAACTCCAG 
ATTTTAGTAA 
TCACATTATT 
ATCACTATGT 
TTGCAGCTCA 
TTTGACTTTG GAGGCAAAAT 
TATAGTTGGA AATAAATGTG 
AAATGAGAAC 



GAATGTACTC 
AAATGGGCAA 
TTAGTATGTG 



AAGACAGGGC 



AATACTTCAA 
TGTTGATCCT 
TCCAGAAATC 
ATATCAGAAA 
CGGCCAAGCT 



TCCTTGCAAT A 



AACCCAAATT 
TTAGGTCTTT 
ACATGTATGT 
TTTTTAAAGC 
TGGTAAATCT 



GGATTTATGA 
GGAATGAAAA 
GAATCCTGCC 
GAGGTGGACA 
GGTGAAAAAT 
CTCACTTATA 
AAGCAGGAAG 
GCAGAAGCAT 
CTGGAGGTTT 
TTTTTTCTCA 
TTATCTTTTC 
CACTGGAATT 
TAAATATGCT 
ATGTATTCAC 
GAAGAAAGTT 
TAAAGAATTG 



TTATATCAAA 
CCCAAACTAC 
ATGGAGGGCA 



ACTGCAGATA 
TGCATCGATG 
ACTATGAGGG 
AAGATGGCCT 
GCACAAAGAG 
CCAAAAATAA 
ATTTTGAATT 
CAAAAAGTGA 
AAGGTCTCTA 
GGATAAATAT 
ITTAAGTGAT 
TTGGAAAAGA 



CAACAACTCT 
GGAAACCATT 
GCATCATCAT 
CACTTACTCG 
TAATCAGAAT 
AAGAGGATCT 
TGACTTTTTA 



TATTGTAAAG 
ATGCTACTCA 
AAAATGTTAA 
AAGCATCTGC 
TAGTCCAACA 
AGTTTAAAAA 
AACAATGAAG 
CTACTGCACT 



TAACCATGTC 
GCACCCTGGG 
GTCTGGGAGC 
CTGAAT CAAG 
ACCTCCAGCA 
AATTTTTAAT 
TTGAATGTAT 
AAGCAGCCCA 
GGGCAAGGAG 



ACAAAAACAT 
TCTCTTATAG 
TTAGAGGCTA 



CATTTTTCTC 



TCTCCAGAGA 



AGTCTATGAG 
TAAACTTTTC 
TTGCAGTCTG 
TGCAGCTGGG 
GGAGCTAATA 
GTTTCTATTC 
CTCCTAGAGT 
GAGATTGATT 
TACAAAATTT 
GAAAGCCAGG 
GAGATTCCCT 
CAGTTTGCTT 
AAAAGAAAAA 

AGTAGGTTAT TTGTACAGTC 
AGGCCACAAG GAATATGGGT 
GGCTTGGCAC TGCCTTTTCC 
ACCTCTTCTC 
ATGATGAGTC 
CATTCATGGG 
TGAATTAAAT 
CCCTAAAATC 
AGACGGAGTC 
CTGCTCACTG AAAGCTCCGC 
TAGCTGGGAC TACAGGCGCC 
ACGGGGTTTC ACTGTGTTAG 
CTCGGCCTCC CAAAGTGCTG 
GTTTAAAGTC GTCTTCTTTT 
AATTGGATCA ATCTTGAAAT 
ACTCAGGGCA 
TTCCTGCTGT 
GCTCACTCCC 
CTAGTGCCGA TAAACTTTCT 
TAACCATCTC TTTGTTCTTT GAACATGCTG 
TTTGTAATTC 
CATATGTAGT 



„ CTGTCCflATT T 



TCTGCATCCA CAAGTTAGTA 



TTXAAAACTT 
TGACCAACAT 
GAGGGAGCTG 
CTAAGCCCCA 
CTCACTGCCC 



CAGGTTTTCC 
AATTTTAAAA 
TCATTTTAGA 



CAAACTTGAC 
TTCTTCTGAG 
: 1CTTTCT3 
ACCATCCTTC 



TGGCATTGGC 



CTGGGCACTA 



AGCIGACTGC 
ACAAACAAGG 
AGAAGGTCTA 
TCTAATTTGA 



GCTCCGATCT 
GCCTCCTGAG 
TTTAATAGAG 
ATCCGCCTGC 
CTTGTTTTCC 
TGATCATACG 
GGGAGAAAGA 
TTGCTGAAAT 



GGGAGTAAAA 
TTTCTCAGGC 
TATGGCTCAC 
TGAAGGCATT 
AATTGTTGTA 
GCCTATCTAA 
TATGTGTTTT 
TCGCTCTGAC 



ACAGAGGGAA 
AGGAAGATGC 
GCAACATCGT 
CAATGGCAAC 
CTTATTTGGA 
TGCAGGATGA 
TTCCTTCTGC 



CACCACCACG 
CCAGGATGGT 
GGATTACAGG 
AATGTAAT CA 
ACTCAACCAA 
GTCTGAGAAT 
CAGTTTTATC 
ATCAAAACCT 
CCAGTATCAC 



AGACTTAGAC 
GCACAGGCTG 
CATGCCATTC 
CCOGGCTAAT 
CTCGATCTCC 
CATGACCCAC 
TTTTGAACAT 
AAGACAGTCG 



TGACCTCGTG 



TAACGGCTAC 
GCTACCTCCC 
rCTGTT' 



G7GTGAAAGT 
AGAAGCCAGG 
GTAAGCCTAG 
TGAAACACCC 
CAAGACTTTA 
ATAAAA< 



GGTCTGCATG TATGCCC3AA 



I ATTATTATTT 



TGTAAATATA 



TTGCAATGTT 



2S80 
2640 
2700 
2760 
2620 
2680 

3000 
3060 
3120 
3180 
3240 



TAAATGCTGC 



3960 



CTTTGGGAGA 
AGGCCTTCAA 
CTGCTTCATA 
TGCCATTTGA 



GCCTGAACTG 
AGCCCTCCTT 
TATTCCTACA 
TTTTTATTGC 



TCCTGCCTCA 4860
4920 
4980 
5040
5100 
5160 
5220 
5280 



CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 
GCTTTCATTT TTCCCCCAGT GAATGATTTA GAATTTTTTA 
TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCAGT 
TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTTACTT 



5400 
5460 
5520 



ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 
GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 
TCTTTGTTGT ATTAATGGGG AATATTTTGG ACAATGTTTC
AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACT7TAATGA 6300 

CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATC7TTTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTCTGTA ATAGTAAAAT 6540 

ACCGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600 

TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 

ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840 

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900
I CCCGAAACAT G 






I I I I I I 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 

PVFTEAIYNF EVLESSRPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 

TGVITTVSHY LDREWDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSW 

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YWICKPKMG YTDILAVDPD 

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 
TKLLRVWLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDKCRYTYSE WHSFTQPRLG EKLHRCNQNE 

DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR 



1 11 21 31 41 51 

I I I I I I 

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 
TAAGAGCW. GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 

CCATGGCCCA GTTTGGAGGC CTGCCCGTGC CCCTGGACCA GACCCTGCCC TTGAATGTGA
ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 
ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 
TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGACGT 
CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 

AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA

TAAAGCTCCA AGTGAATACG CCCCTGGTCG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 
TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 
TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 
CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA
CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 
AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 
GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGCC CCCTCTCCTT 
TCCCACCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAA



Seq r 



I I I I I I 

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALENGLL 

SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNKIIDIKV TDPQLLEU3L 

VQSPDGHRLY VTIPLGIKLQ WTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 

THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 



1 11 21 31 41 51

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 
TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG
TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 
TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAA3C TCAC7ATTGA ATCCACGCCG 
TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 
TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 
GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 
CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA
CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600 

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900 

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCA3CA ATCCACCCAA 960 

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020 

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080 

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440

CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTG7 GGCCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800 

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920 

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980 

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTC7CTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160 

CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGG- TGC-CTGATA 2220 

TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCGCTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520 

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCAGGAAACT TTTTTTCTTT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 



1 11 21 31 41 51 

I I I I I I 

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAI LTIES F! "IE /LLLVHNLPQ 
IGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 
TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 
NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LN\'LYGPDAP 
TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 
AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 
QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 
SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 
NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVMGQS 
LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 
PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNHN GTYACFVSNL 
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI 

Seq ID NO: 534 DNA sequence 

Nucleic Acid Accession #: NM_006952.1 

Coding sequence: 11..793

i i 1 i 1 i 1 i 1 T 

AATCCCGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGAT7TT 
TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 
ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 
GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAC-GCAT 
TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG 7ATTTCATTC TGATGTTTAT 
AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 
ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 
TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 
CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 
TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 
AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 
CTGCTATGAA CTGATCTCTG GTCCAATGAA CCGACACGCC TGGGGGGTTG CC7GGTTTGG 
T CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 



MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW
IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILHFIVYA FEVASCITAA TQRDFFTPN^ 
FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 
DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMHRHA W3VAWFGFAI 
G TMFYWSRIEY 



sequence 



PCT/US02/12476 



GCTGGACTGC 



AGGCAGCTGT 
TCAATGGACA 
CGCAAGAGCC 



ATAAAGATTG 
CAGCTTCTTG 
CACGGGAGTT 
AGATCCCGTT 
AGTCAAAGGT 



TCAAGAAGTG 
CGGTCCTTGC 
TGCTGCCCTT 
GAGCTGCCTC 



CTGTGAAGGC 
TGCACCTGTG 
CCCCTTCCCA 
TCTCATCCAC 



CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 
GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 
ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 
GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 
TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 
CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 
CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAC-GAA 
TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 
CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 
CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 
TTTCCAATAA A 



Protein Accession #:



Seq ID NO: 533 DNA sequence 

Nucleic Acid Accession #: NM_001793.2 

Coding sequence: 71..2560



I 



41 



CTCTGCAGCC ATGGGGCTCC 



Z CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 



G GAACACCGGC 

C TCTCGCGTCT CTCCTCCTTC 
S CCGGGCGGTC TTCAGGGAGG 
CTTGGAGGCC GGAGGCGCGG AGCAGGAGCC CGGCCAGGCG CTGGGGAAAG 
G CTCTGTTTAG CACTGATAAT GATGACTTCA 
A GAAGGTCACT GAAGGAAAGG AATCCATTGA 
A GACACAAGAG AGATTGGGTG GTTGCTCCAA 
r TCCCCCAGAG ACTGAATCAG CTCAAGTCTA 



1 5G 'GAGA A 
ATCCAAACGT 
TGAAAATGGC 
AGACACCAAG 
CTTCGCTGTA 
GATTGCCAAG 
CCCCATGAAC 
GGACACCTTC 



G CAAGAGCCAG C 



ATCTTACGAA 
AAGGGTCCCT 
ATTTTCTACA 



TATGAGCTCT 



CATCCAGGCC 



GCCTGAGAAT 
CAACTCACCA 
TACCATCACC 
TTTTGAGGCC 
GCTGAAGCTC 
ACCTGTGTTT 
GCCTGTGTGT 
CATCCTGAGA 
TGTGGGCACC 
GGTCTTGGCC 
ACTGATTGAT 
CCAAAGCCCT 
CCCTTTCCAG 



CGAGGGAGTG 
GATGAGGATG 
GAACCAAAGG 
GTCATCTCCA 
ACAGACATGG 
GCCAATGACA 
GCAGTGGGCC 



CAGGCTGGTT GTTGTTGAAT AAGCCACTGG 
TTGGCCACGC TGTGTCAGAG AATGGTGCCT 
TCGTGACCGA CCAGAATGAC CACAAGCCCA 
TCTTAGAGGG AGTCCTACCA GGTACTTCTG 
ATGCCATCTA CACCTACAAT GGGGTGGTTG 
ACCCACACGA CCTCATGTTC 



CTGAAGTGAC 
TATTCATGGG 
CTG7GCGGAA 
AGATCTTCCC 
TATCTGTCCC 
ATAAAGATAG 
CTGAGGG7GT 
ACCGGGAGGA 
CAGTGGAGGA 
AGTTTACCCA 
TGATGCAGGT 
CTTACTCCAT 



GTCCCTGAGT ACACACTGAC 



ACCCACCCTG 
AAAAACCAGC 
CCAACCTCCA 
GTCCCACCCT 
GTCTACACTG 
GACCCAGCAG 
CTCGACCGTG 
ATGGACAATG 



ATGGGGACGG 
ATGCTCCCAT 
ATGAGGTGCA 
CCACCTACCT 
AG AG C AACCA 
ACACCCTGTA 
CAGCCACCAT 
CCAAAGTCGT 
CAGAAGACCC 
GGTGGCTAGC 



TATCATGGGC 



AGTGGTCCAC 
TGAGGTCCAG 
TGACAAGGAG 



GCACCTTTCT 
GTGCGACTGC 
CCCTGTGCTG 
GAGAAAGAAG 



GCCCAGCTCA 
ACAGTGGTCT 
CTGTCTGACC 
CATGGCCATG 
GGGGCTGTCC 
CGGAAGATCA 
TATGGCGAAG 
GGTCTGGAGG 
ACACCCATGT 



GAAGCCCTCC 
ATGGCCCAGT 
TGCTGAACAT 
CAGATGACTC 
TGTCCCTGAA 
ATGGCAACAA 
TCGAAACCTG 



AGGAGCCCCT 



CTTGGTGTTC 
CTCCGCCTCC 
GAAGCTGGCA 
GGGACCAAAC 
GACTTCGGAG 



GACTATGAGG 
GACCAAG AC C 
GACATGTACG 
GTCAGGCCAC 
CTTGTCAGGA 
GGTTGCTTCC 
ACCTCTCCAC 
CGTAAAATGC 
TTTCTCTCTC 



CCAGGCCGGA 
ACCGTCCTCG 
CGGCTAACAC 
GCAGCGGCTC 
AAGATTACGA 



GTTTGTGAGG 
CACCACTGGC 
CCCTGAGCCC 
CACGGACAAG 
AGACATCTAC 
GAAGTTCCTG 
AGAGCAGCTG 
CCCTGGACCC 
GTTCCTCCTG 
CCTACTCCCA 
CGAAGAGGAC 
GGTGGTTCTC 
GCCAGCCAAC 



AGGCCCATGT 
TGGACGCCCC 
GGTGACGACG GGGACCATTT 
ACAACCAGGA 
ACCAACGAGG 
GTGGAGGATG TGAATGAGGC 
CCACTGGGGA 
TCAGCTACCG 
AGGTCACAGC 
ATGAAGTCAT 
TTCTGCTAAC 
CCATCTGCAA 
CCCACACCTC 
AGGTCAACGA 
AAGCAGGATA CATATGACGT 



AATCAAAAGA 
GACAGTGGGC 
AACAACATCT 
ACGGGAACCC 



GACCTGTCTC 



TGGAAGGGAG 
CTGGTGCTGC 
GAAGATGACA 
CAGGACTATG 



AGAGCATCTC 
AGTGGCCGTA 
TTAGCCTTTC 
CTGGGCCAGG 
TCAACCCTGT 
GAATGGAACC 



CGACGCCGCG 
TTATCTGAAC 
GGACGACTAG 
CAAGGGGTCT 
GCAACTTGGC 
AGGATGGAGG 
GTTGCCTCAG 



CCCGTGACAA 
ACATCACCCA 
TGGCACCAAC 
TCGGCAACTT 
ACGACACCCT 



GAGTGGGGCA 

CAGTTCCCCC 
GGAGACAGGC 
AAIGTGGGCA 

TGGGCCTGCT 



GCCGCTTCAA 2520

TGCAGGGCTG 2580

TTCAGCTGAG 2640

TATGAGTCTG 2700

GTTTGACTTC 2760
TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000

GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060

TGGATCTCTG CGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120

GTTGCGTTGC TATAGATGAA GGGTGAGGAC AATCGTGTAT ATGTACTAGA ACTTTTTTAT 3180
TAAAGAAACT TTTCCCAGAA AAAAA

10 i 

I I I I I I 

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGASQEPGQA LGKVFMGCPG 60

QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120 

KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180 

YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240

DEDDAIYTYN GWAYSIHSQ EPKDPHDLMF TIHRSTGTIS VIS3GLDREK VPEYTLTIQA 300 

TDMDGDGSTT TAVAWEILD ANDNAPMFDP QKYEAHVPEN AVGHFA'QRLT VTDLDAPNSP 360

AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 

PTSTATIWH VEDVHEAPVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 

DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNI YEVMVLA MDNGSPPTIG TGTIiLTLID 540

VNDHGPVPEP RQITICNQSP VRQVLNITDK DLSPHTSPFQ AQLTEDSDIY WTAEVNEEGD 600 

TWLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660

GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 

GLEARPEWL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIS NLKAANTDPT APFYDTLLVF 780 
DYEGSGSDAA SLSSLTSSAS DQDODYDYLN EWGSRFKKLA DMYGGGEDD



1 11 21 31 41 51 

I 1 I I I I 

ATGAGGCTCC AAAGACCCCG ACAGGCCCCG GCGGGTGGGA GGCGCGCGCC CCGGGGCGGG 

CGGGGCTCCC CCTACCGGCC AGACCCGGGG AGAGGCGCGC GGAGGCTGCG AAGGTTCCAG 

AAGGGCGGGG AGGGGGCGCC GCGCGCTGAC CCTCCCTGGG CACQGCTGGG GACGATGGCG

CTGCTCGCCT TGCTGCTGGT CGTGGCCCTA CCGCGGGTGT GGACAGACGC CAACCTGACT 

GCGAGACAAC GAGATCCAGA GGACTCCCAG CGAACGGACG AGGGTGACAA TAGAGTGTGG 

TGTCATGTTT GTGAGAGAGA AAACACTTTC GAGTGCCAGA ACCCAAGGAG GTGCAAATGG 

ACAGAGCCAT ACTGCGTTAT AGCGGCCGTG AAAATATTTC CACGTTTTTT CATGGTTGCG 

AAGCAGTGCT CCGCTGGTTG TGCAGCGATG GAGAGACCCA AGCCAGAGGA GAAGCGGTTT

CTCCTGGAAG AGCCCATGCC CTTCTTTTAC CTCAAGTGTT GTAAAATTCG CTACTGCAAT 

TTAGAGGGGC CACCTATCAA CTCATCAGTG TTCAAAGAAT ATGCTGGGAG CATGGGTGAG 

AGCTGTGGTG GGCTGTGGCT GGCCATCCTC CTGCTGCTGG CCTCCATTGC AGCCGGCCTC 
AGCCTGTCTT GA



AGGRRAPRGG RGSPYRPDPG RGARRLRRFQ KGGEGAPRAD PFWAPLGTMA 
PRVWTDANLT ARQRDPEDSQ RTDEGDNRVW CHVCERENTF SCONPRRCXW 
FPRFFMVA KQCSAGCAAM ERPKPEEKRF LLEEPMPFFY LKCCKIRYCN 
, r ^urruos. FKEYAGSMGE SCGGLWIAIL LLLASIAAGL SLS 

ft sequence 

Coding sequence: 53..1S7S 

60 i 

I I I I I I 

GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AGGCCCGGCC GGTGCGCAGA GCATGGCGGG 60 

TGCGGGCCCG AAGCGGCGCG CGCTAGCGGC GCCGGCGGCC GAGGAGAAGG AAGAGGCGCG 120 

GGAGAAGATG CTGGCCGCCA AGAGCGCGGA CGGCTCGGCG CCGGCAGGCG AGGGCGAGGG 180 

CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCGTG GCCATCATCG TGGGGACCAT 240

TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GCTCGCCGGG 300 

GCTGGCGCTG GTGGTGTGGG CCGCGTGCGG CGTCTTCTCC ATCGTGGGCG CGCTCTGCTA 360 

CGCGGAGCTC GGCACCACCA TCTCCAAATC GGGCGGCGAC TACGCCTACA TGCTGGAGGT 420 

CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480 

ATCGCAGTAC ATCGTGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540

CTGCCCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600

GGCCGTGAAC TGCTACAGCG TGAAGGCCGC CACCCGGGTC CAGGATGCCT TTGCCGCCGC 660

CAAGCTCCTG GCCCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GAAAGGGTGA 720 

TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGATG TGGGGAACAT 780 

TGTGCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTCGT 840

CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900 

CATCGTGACG CTGGTGTACG TGCTGACCAA CCTGGCCTAC TTCACCACCC TGTCCACCGA 960 

GCAGATGCTG TCGTCCGAGG CCGTGGCCGT GGACTTCGG3 AACTATCACC TGGGCGTCAI 1020 

GTCCTGGATC ATCCCCGTCT TCGTGGGCCT GTCCTGCTTC GGCTCCGTCA ATGGGTCCCI 1080 

GTTCACATCC TCCAGGCTCT TCTTCGTGGG GTCCCGGGAA GGCCACCTGC CCTCCATCCI 1140

CTCCATGATC CACCCACAGC TCCTCACCCC CGTGCCGTCC CTCGTGTTCA CGTGTGTGAT 1200

GACGCTGCTC TACGCCTTCT CCAAGGACAT CTTCTCCGTC ATCAACTTCT TCAGCTTCTT 1260

CAACTGGCTC TGCGTGGCCC TGGCCATCAT CGGCATGATC TGGCTGCGCC ACAGAAAGCC 1320

TGAGCTTGAG CGGCCCATCA AGGTGAACCT GGCCCTGCCT GTGTTCTTCA TCCTGGCCTG 1380

CCTCTTCCTG ATCGCCGTCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TCGGCTTCAC 1440

CATCATCCTC AGCGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCCCAA 1500

GTGGCTCCTC CAGGGCATCT TCTCCACGAC CGTCCTGTGT CAGAAGCTCA TGCAGGTGGI 1560 
MAGAGPKRRA LAAPAAEEKB EAREKMLAAK SADGSAPAGE GEGVTLQHNI 1
GTIIGSGIFV TPTGVLKEAG SPGIALWWA ACGVFSIVGA LCYAELGTTI SKSGGDYAYM 
LEVYGSLPAP LKLWIELLII RPSSQYIVAL VPATYLLKPL PPTCPVE 
LLTAVKCYSV KAATRVQDAF AAAKLLALAL IILLGFVQIG KGDVSNLDPN F 
GNIVLALYSG LFAYGGWNYL NFVTEEMINP YRNLPLAI II SLPIVTLVYV LTNLAYFTTL 300 
STEQMLSSEA VAVDFGNYHL GVMSWIIPVP VGLSCPGSVN GSLFTSSRLF FVGSREGHLP 360 
SILSMIHPQL LTPVPSLVFT CVMTLLYAFS KDIFSVINFF SFFNWLCVAL AIIGMIWLRH 420 
RKPELERPIK VNLALPVFFI LACLFLIAVS FWKTPVECGI GFTIILSGLP VYFFGVWWKN 480 
KPKWLLQGIF STTVLCQKLM QWPQET 



Coding sequence: 168..989

1 11 21 31 41 51

I I I I I 

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT 60 

TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGCTGG GAGCCAGGAG 120

AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA 180 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240 

TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGCCGAG CGTGIGTGGA 300 

GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCTGCTCC AACGTCTGCT 360 

TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420

CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTCAG GAGAAGAGGC 480 

ACCGAGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540

GTGGGCTCTG GTGGACATAT GTCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT 600 

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660 

ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720

TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840 

TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900

CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC 960 

GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080 

CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAGCCAG TTCCTAGTCC 1140 

TCAACTCCAG CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA 1200 
GCTCGGTTTC CTTTTCTAGA A 




I I I I I 

MNWSIFEGLL SGVHKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 
SNVCFDEFFP VSHVRLWALQ LILVTCPSLL WMHVAYREV QSKRHREAIIG ENSGRLYLNP 
,r RGGLWWT YVCSLVFKAS VDIAFLYVFH SFYPKYILPP WKCHADPCP NIVDCFISKP 
SEKNIFTLFM VATAAICILL NLVELIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ 
DDLLSGDLIF LGSDSHPPLL PDRPRDHVKK TIL 



Coding sequence: 26..457

1 11 21 31 41 51 

I I I I I 

CGGGCGAAG C AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 
CGCCCTGCTG GCGCTCACC1 CCGCGG1CGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 
CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 
TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 
CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAG C 
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 
GCCCCTGGTG T CACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 
TAATAT 



I I I I I I 

MQHRGFIiLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTESSK DCGVGFRSGT 
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQSTI 
I KAKAKAKKGK GKD 



Seq ID NO: 548 DNA sequence 
Nucleic Acid Accession #: NM_006783.1
Coding sequence: 1..786 

1 11 21 31 41 51

ATGGATTGGG GGACGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 60 

GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 120

CAGGAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGGGATGC 180 

AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCGGCTGTG GGCCCTCCAG 240

CTGATCTTCG TCTCCACCCC AGCGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 300 

Z GCAAGTTCAG GCGAGGAGAG AAGAGGAATG ATTTCAAAGA CATAGAGGAC 360 

W3GTTCG GATAGAGGGG TCGCTGTGGT GGACGTACAC CAGCAGCATC 42 0 

\ TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480 

TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 540 

TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC IGCGTCTGTG 600 

ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660 

AGAT C AAAGA GAGCACAGAC GCAAAAAAAT CACCCCAATC ATGCCCTAAA GGAGAGTAAG 72 0 

CAGAATGAAA TGAATGAGCT GATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 780 



1 11 21 31 41 51 

25 | | | 1 | I 

MDWGTLHTFI GGVMKHSTSI GKVWITVIFI PRVMILWAA QEVKGDEQED FVCNTLQP3C 

KNVCYDHFFP VSHIRLWALQ. LIFVSTPALL VAMHVAYYRH ETTRKFRRGE KRNDFKDISD 

IKKHKVRIEG SLWWTYTSSI FFRI IFEAAF MYVFYFLYNG YHLPWVLKCG IDPCPNLVDC 

FI SRPTEKTV FTIFMISASV ICMLLHVAEL CYLLLKVCFR RSKRAQTQKN HPNHALKESK 
QNEMNELISD SGQNAITGFP S

Seq ID NO: 550 DNA sequence 
Nucleic Acid Accession #: NM_002571.1 
Coding sequence: 99..587

1 11 21 31 41 51

I I I I I I 

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 
TCACCCTGGG CGTGGCCCTG GTCTGTGGTG TCCCGGCCAT GGACATCCCC CAGACCAAGC 
AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTTGC 
CCACCCCCCA GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 
AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 
TGGCGAACGA GGCCACGCTG CTCGATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 
AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG

AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 
GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 
CCAGGAAGAC CAGACTCCCA CCCTTCCACA CCTCCAGAGC AGTGGGACTT CCTCCTGCCC 
TTTCAAAGAA TAACCACAGC T CAGAAGACG ATGACGTGGT CATCTGTGTC GCCATCCCCT 
TCCTGCTGCA CACCTGCACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG
C CTTGGAGCAT G 

Seq ID NO: 551 Protein sequence



MDIPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLEIVLHR 60 

WENNSCVEKK VLGEKTGNPK KFKINYTVAN EATLLDTDYD NFLFLCLQDT TTPIQSMMCQ 120
YLARVLVEDD EIMQGFIRAF RPLPRHLWYL LDLKQMEEPC RF 

Seq ID NO: 552 DNA sequence 

Nucleic Acid Accession #: NM_006500.1 

Coding sequence: 27..1967

1 11 21 31 41 51 

I I 1 I I I 

ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 

TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120

CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 

AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240 

TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAG CAGCGGC 300 

TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 

GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420 

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 

CGTCCCAGAC IGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720

GGAAC CACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 
TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500 

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATIGTGTGC ATCCTGGTCC 1740 

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GIAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCAITAGCCC CGAATCACTT 1980

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

C ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

I AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880

ACGCGTACCT GGGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 

Seq ID NO: 553 Protein sequence

1 11 21 31 41 51

II III 

GLPRLVCAFL LAACCCCPRV AGVPGEAEQP APELVEVEVG STALLKCGLS QSQGNLSHVD 60

WFSVHKEKRT LIFRVRQGQG QSEPGEYEQR LSLQDRGATL ALTQVTPQDE RIFLCQGKRF 120

RSQEYRIQLR VYKAPEEPNI QVNPLGIPVN SKEPEEVATC VGRNGYPIPQ VIWYKNGRPL 180 

KEEKNRVHIQ SSQTVESSGL YTLQSILKAQ LVKEDKDAQF YCELNYRLPS GNHMKESREV 240 

TVPVFYPTEK VWLEVEPVGM LKEGDRVEIR CLADGNPPPH FSISKQNFST REAEEETTND 300 

NGVLVLEPAR KEHSGRYECQ AWNLDTMISL LSEPQELLVN YVSDVRVSPA APERQEGSSL 360 

TLTCEAESSQ DLEFQWLREE TDQVLERGPV LQLHDLKREA GGGYRCVASV PSIPGLNRTQ 420

LVKLAIFGPP WMAFKERKVW VKENMVLNLS CEASGHPRPT ISWNVNGTAS EQDQDPQRVL 480 

STLNVLVTPE LLETGVECTA SNDLGKNTSI LFLELWLTT LTPDSNTTTG LSTSTASPHT 540 

RANSTSTERK LPEPESRGW IVAVIVCILV LAVLGAVLYF LYKKGKLPCR RSGKQEITLP 600 
PSRKTELWE VKSDKLPEEM GLLQGSSGDK RAPGDQGEKY IDLRH 

Seq ID NO: 554 DNA sequence 

Nucleic Acid Accession #: NM_003183.3

Coding sequence: 165..2639

1 11 21 31 41 51 

I I I I I I 

TCGAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGCGCGGG AGGGAAAAGA GGATTGAGGG 60 

GCTAGGCCGG GCGGATCCCG TCCTCCCCCG ATGTGAGCAG TTTTCCGAAA CCCCGTCAGG 120

CGAAGGCTGC CCAGAGAGGT GGAGTCGGTA GCGGGGCCGG GAACATGAGG CAGTCTCTCC 180 

TATTCCTGAC CAGCGTGGTT CCTTTCGTGC TGGCGCCGCG ACCTCCGGAT GACCCGGGCT 240 

TCGGCCCCCA CCAGAGACIC GAGAAGCTTG ATTCTTTGCT CICAGACTAC GATATTCTCT 300 

CTTTATCTAA TATCCAGCAG CATTCGGTAA GAAAAAGAGA TCTACAGACT T CAAC ACATG 360 

TAGAAACACT ACTAACTTTT TCAGCTTTGA AAAGGCATTT TAAATTATAC CTGACATCAA 420

GTACTGAACG TTTTTCACAA AATTTCAAGG TCGTGGTGGT GGATGGTAAA AACGAAAGCG 480

AGTACACTGC AAAATGGCAG GACTTCTTCA CTGGACACGT GGTTGGTGAG CCTGACTCTA 540 

GGGTTCTAGC CCACATAAGA GATGATGATG TTATAATCAG AATCAACACA GATGGGGCCG 600 

AATATAACAT AGAGCCACTT TGGAGATTTG TTAATOATAC CAAAGACAAA AGAATGTTAG 660 

TTTATAAATC TGAAGATATC AAGAATGTTT CACGTTTGCA GTCTCCAAAA GTGTGTGGTT 720

ATTTAAAAGT GGATAATGAA GAGTTGCTCC CAAAAGGGTT AGTAGACAGA GAACCACCTG 780

AAGAGCTTGT TCATCGAGTG AAAAGAAGAG CTGACCCAGA TCCCATGAAG AACACGTGTA 840

AATTATTGGT GGTAGCAGAT CATCGCTTCT ACAGATACAT GGGCAGAGGG GAAGAGAGTA 900

CAACTACAAA TTACTTAATA GAGCTAATTG ACAGAGTTGA TGACATCTAT CGGAACACTT 960

CATGGGATAA TGCAGGTTTT AAAGGCTATG GAATACAGAT AGAGCAGATT CGCATTCTCA 1020

AGTCTCCACA AGAGGTAAAA CCTGGTGAAA AGCACTACAA CATGGCAAAA AGTTACCCAA 1080 

ATGAAGAAAA GGATGCTTGG GATGTGAAGA TGTTGCTAGA GCAATTTAGC TTTGATATAG 1140 

CTGAGGAAGC ATCTAAAGTT TGCTTGGCAC ACCTTTTCAC ATACCAAGAT TTTGATATGG 1200

GAACTCTTGG ATTAGCTTAT GTTGGCTCTC CCAGAGCAAA CAGCCATGGA GGTGTTTGTC 1260

CAAAGGCTTA TTATAGCCCA GTTGGGAAGA AAAATATCTA TTTGAATAGT GGTTTGACGA 1320

G CACAAAGAA TTATGGTAAA ACCATCCTTA CAAAGG AAG C TGACCTGGTT ACAACTCATG 1380 




AATTGGGACA TAATTTTGGA GCAGAACATG ATCCGGATGG TCTAGCAGAA TGTGCCCCGA 1440 

ATGAGGACCA GGGAGGGAAA TATGTCATGT ATCCCATAGC TGTGAGTGGC GATCACGAGA 1500 

ACAATAAGAT GTTTTCAAAC TGCAGTAAAC AATCAATCTA TAAGACCATT GAAAGTAAGG 1560 

CCCAGGAGTG TTTTCAAGAA CGCAGCAATA AAGTTTGTGG GAACTCGAGG GTGGATGAAG 1620

GAGAAGAGTG TGATCCTGGC ATCATGTATC TGAACAACGA CACCTGCTGC AACAGCGACT 1680

GCACGTTGAA GGAAGGTGTC CAGTGCAGTG ACAGGAACAG TCCTTGCTGT AAAAACTGTC 1740 

AGTTTGAGAC TGCCCAGAAG AAGTGCCAGG AGGCGATTAA TGCTACTTGC AAAGGCGTGT 1800 

CCTACTGCAC AGGTAATAGC AGTGAGTGCC CGCCTCCAGG AAATGCTGAA AATGACACTG 1860

TTTGCTTGGA TCTTGGCAAG TGTAAGGATG GGAAATGCAT CCCTTTCTGC GAGAGGGAAC 1920 

AGCAGCTGGA GTCCTGTGCA TGTAATGAAA CTGACAACTC CTGCAAGGTG TGCTGCAGGG 1980 

G CCGCTGTGTG CCCTATGTCG ATGCTGAACA AAAGAACTTA TTTTTGAGGA 2 040 

A GGATTTTGTG ACATGAATGG CAAATGTGAG AAACGAGTAC 2100 

T TGAACGATTT TGGGATTTCA TTGACCAGCT GAGCATCAAT ACTTTTGGAA 2160 

AGTTTTTAGC AGACAACATC GTTGGGTCTG TCCTGGTTTT CTCCTTGATA TTTTGGATTC 2220 

CTTTCAGCAT TCTTGTCCAT TGTGTGGATA AGAAATTGGA TAAACAGTAT GAATCTCTGT 2280 

CTCTGTTTCA CCCCAGTAAC GTCGAAATGC ( fl -it ATTCT3CA TCGGTTCGCA 2340 

TTATCAAACC CTTTCCTGCG CCCCAGACTC CAGGCCGCCT GCAGCCTGCC CCTGTGATCC 2400 

CTTCGGCGCC AGCAGCTCCA AAACTGGACC ACCAGAGAAT GGACACCATC CAGGAAGACC 2460 

CCAGCACAGA CTCCCATATG GACGAGGATG GGTTTGAGAA GGACCCCTTC CCAAATAGCA 2520

GCACAGCTGC CAAGTCATTT GAGGATCTCA CGGACCATCC GGTCGCCAGA AGTGAAAAGG 2580

CTGCCTCCTT TAAACTGCAG CGTCAGAATC GTGTTAACAG CAAAGAAACA GAGTGCTAAT 2640

TTAGTTCTCA GCTCTTCTGA CTTAAGTGTG CAAAATATTT TTATAGATTT GACCTACAAA 2700 

TCAATCACAG CTTGTATTTT GTGAAGACTG GGAAGTGACT TAGCAGATGC TGGTCATGTG 2760 

TTTGAACTTC CTGCAGGTAA ACAGTTCTTO TGTGGTTTGG CCCITCTCCT TTTGAAAAGG 2820 

[TTGAG GCTTTCAGGT TTTAGTTTTT AAAATATCTT 2 880 

\ATACA GCTGGATTGG GTTATGAATA TTTACGTTTT 2 940 

TGTAAATTAA TCTTTTATAT TGATAACAGC ACTGACTAGG GAAATGATCA GTTTTTTTTT 3000 

ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA GAAAAGTGGA 3060 

ATAGTTTTTT TTTTTTTTTT TTTTTTTTCC CTTCAACTAA AAACAAAGGA GATAAATTTA 3120 

GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT TTTTATGTAG 3180 

T ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC TTCATAATTC 3240 

T GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA TGGTAGCCAG 3 30 0 

T GGAATCTACC AACTGTTTAG GGCCCTGATT TGCTGGGCAG TTTTTCTGTA 3360 

TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT AGAAAATTCA 3420

CTATTGGCTC3 GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAG GCTGAGGTTG 3480 
CGCCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 555 Protein sequence 

1 11 21 31 41 51

MRQSLLFLTS WPFVLAPRP PDDPGFGPHO RLEKLDSLLS DYDILSLSNI QQHSVRKRDL 60

QTSTHVETLL TFSALKRHFK LYLTSSTERF SQNFKWWD GKNESEYTAK WQDFFTGHW 120

GEPDSRVLAII IRDDDVIIRI NTDGAEYNIE PLWRFVNDTK DKRMLVYKSE DIKNVSRLQS 180 

PKVCGYLKVD NEELLPKGLV DREPPEELVH RVKRRADPDP MKNTCKLLW ADHRFYRYMG 240

RGEESTTTNY LIELIDRVDD IYRNTSWDNA GFKGYGIQIE QIRILK3PQE VKPGEKHYNM 300 
] AWDVKMLLEQ FSFDIAEEAS KVCLAHLFTY QDFDMGTLGL A 

Y SPVGKKNIYL NSGLTSTKNY GKTILTKEAD LVTTHEI 

AECAPNEDQG GKYVMYPI AV SGDHENNKMF SNCSKQSIYK TIKSKAQECF QERSNKVCGN 
SRVDEGEECD PGIMYLNNDT CCNSDCTLKE GVOCSDRNSF CCKKCQFETA QKKCQEAINA 
TCKGVSYCTG NSSECPPPGN AENDTVCLDL GKCKDGKCIE FCSREQQLES CACNETDNSC 
KVCCRDL3GR CVPYVDAEQK NLFLRKGKPC TVGFCDMNGK CEKRVQDVIE RFWDFIDQLS 
INTFGKFLAD NIVGSVLVFS LIFWIPFSIL VHCVDKKLDK QYESLSLFHP 3KVEMLSSMD 
SASVRIIKEF PAPQTPGRLQ PAPVIPSAEA APKLDHQRMD TIQEDPSTDS HMDEDGFEKD 
PFPNSSTAAK SFEDLTDHPV ARSEKAASFK LQRQNRVNSK ETEC 

Seq ID NO: 556 DNA sequence 
Nucleic Acid Accession #: NM_021832.1 
Coding sequence: 164..2248



TCGAGCCTGG CGGTAGAATC TTCCCAGTAG G 



120 
180 



ATTCCTGACC AGCGTGGTTC CTTTCGTGCT GGCGCCGCGA CCTCCGGATG ACCCGGGCTT 
CGGCCCCCAC CAGAGACTCG AGAAGCTTGA TTCTITGCTC TCAGACTACG ATATTCTCTC 
TTTATCTAAT ATCCAGCAGC ATTCGGTAAG AAAAAGAGAT CTACAGACTT CAACACATGT 
AGAAACACTA CTAACTTTTT CAGCTTTGAA AAGGCATTTT AAATTATACC IGACATCAAG 
TACTGAACGT TTTTCACAAA ATTTCAAGGT CGTGGTGGTG GATGGTAAAA ACGAAAG CG A 

GTACACTGTA AAATGGCAGG ACTTCTTCAC TGGACACGIG GTTGGTGAGC CTGACTCTAG 540

GGTTCTAGCC CACATAAGAG ATGATGATCT TATAATCAGA ATCAACACAG ATGGGGCCGA 600

ATATAACATA GAGCCACTTT GGAGATTTGT TAATGATACC AAAGACAAAA GAATGTTAGT 660

TTATAAATCT GAAGATATCA AGAATGTTTC ACGTTTGCAG TCTCCAAAAG TGTGTGGTTA 720

TTTAAAAGTG GATAATGAAG AGTTGCTCCC AAAAGGGTTA GTAGACAGAG AACCACCTGA 780 

AGAGCTTGTT CATCGAGTGA AAAGAAGAGC IGACCCAGAT CCCATGAAGA ACACGTGTAA 840 

ATTATTGGTG GTAGCAGATC ATCGCTTCTA CAGATACATG GGCAGAGGGG AAGAGAGTAC 900 

AACTACAAAT TACTTAATAG AGCTAA1TCA CAGAGTTGAT GACATCTATC GGAACACTTC 960 

ATGGGATAAT GCAGGTTTTA AAGGCTATGG AATACAGATA GAGCAGATTC GCATTCTCAA 1020

GTCTCCACAA GAGGTAAAAC CTGGTGAAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080

TGAAGAAAAG GATGCTTGGG ATGTGAAGAT GTTGCTAGAG CAATTTAGCT TTGATATAGC 1140
TGAGGAAGCA TCTAAAGTTT GCTTGGCACA CCTTTTCACA TACCAAGATT T 
AACTCTTGGA TTAGCTTATG TTGGCTCTCC CAGAGCAAAC AGCCATGGAG G 

AAAGGCTTAT TATAGCCCAG TTGGGAAGAA AAATATCTAT TTGAATAGTG GTTTGACGAG 1320

CACAAAGAAT TATGGTAAAA CCATCCTTAC AAAGGAAGCT GACCTGGTTA CAACTCATGA 1380

ATTGGGACAT AATTTTGGAG CAGAACATGA TCCGGATGGT CTAGCAGAAT GTGCCCCGAA 1440
TGAGGACCAG GGAGGGAAAT ATGTCATGTA TCCCATAGCT GTGAGTGGCG ATCACGAGAA 1500

CAATAAGATG TTTTCAAACT GCAGTAAACA ATCAATCTAT AAGACCATTG AAAGTAAGGC 1560

CCAGGAGTGT TTTCAAGAAC GCAGCAATAA AGTTTGTGGG AACTCGAGGG TGGATGAAGG 1620

AGAAGAGTGT GATCCTGGCA TCATGTATCT GAACAACGAC ACCTGCTGCA ACAGCGACTG 1680

CACGTTGAAG GAAGGTGTCC AGTGCAGTGA CAGGAACAGT CCTTGCTGTA AAAACTGTCA 1740

CTACTGCACA GGTAATAG CA GTGAGTGCCC GCCTCCAGGA AATGCTGAAG ATGACACTGT 1860 

TTGCTTGGAT CTTGGCAAGT GTAAGGATGG GAAATGCATC CCTTTCTGCG AGAGGGAACA 1920 

GCAGCTGGAG TCCTGTGCAT GTAATGAAAC TGACAACTCC TGCAAGGTGT GCTGCAGGGA 1980 

CCTTTCCGGC CGCTGTGTGC CCTATGTCGA TGCTGAACAA AAGAACTTAT TTTTGAGGAA 2040

AGGAAAGCCC TGTACAGTAG GATTTTGTGA CATGAATGGC AAATGTGAGA AACGAGTACA 2100 

GGATGTAATT GAACGATTTT GGGATTTCAT TGACCAGCTG AGCATCAATA CTTTTGGAAA 2160 

GTTTTTAGCA GACAACATCG TTGGGTCTGT CCTGGTTTTC TCCTTGATAT TTTGGATTCC 2220 

TTTCAGCATT CTTGTCCATT GTGTGTAACG TCGAAATGCT GAGCAGCATG GATTCTGCAT 2280 

CGGTTCGCAT TATCAAACCC TTTCCTGCGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 2340

CTGTGATCCC TTCGGCGCCA GCAGCTCCAA AACTGGACCA CCAGAGAATG GACACCATCC 2400 

AGGAAGACCC CAGCACAGAC TCACATATGG ACGAGGATGG GTTTGAGAAG GACCCCTTCC 2460 

CAAATAGCAG CACAGCTGCC AAGTCATTTG AGGATCTCAC GGACCATCCG GTCACCAGAA 2520

GTGAAAAGGC TGCCTCCTTT AAACTGCAGC GTCAGAATCG TGTTGACAGC AAAGAAACAG 2580 

AGTGCTAATT TAGTTCTCAG CTCTTCTGAC TTAAGTGTGC AAAATATTTT TATAGATTTG 2640 

ACCTACAATC AATCACAGCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGATGCTG 2700 

GTCATGTGTT TGAACTTCCT GCAGGTAAAC AGTTCTTGTG TGGTTTGGCC CTTCTCCTTT 2760 

TGAAAAGGTA AGGTGAAGGT GAATCTAGCT TATTTTGAGG CTTTCAGGTT TTAGTTTTTA 2820 

AAATATCTTT TGACCTGTGG TGCAAAAGCA GAAAATACAG CTGGATTGGG TTATGAGTAT 2880 

TTACGTTTTT GTAAATTAAT CTTTTATATT GATAACAGGC ACTGACTAGG GAAATGATCA 2940

GTTTTTTTTT ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA 3000

GAAAAGTGGA ATAGTTTTTT TTTTTTTTTT TTTTTTTTGC CTTCAACTAA AAACAAAGGA 3060

GATAAATTTA GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT 3120

TTTTATGTAG CAGGGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC 3180 

TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA 3240 

TGGTAGCCAG TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCCTGATT TGCTGGGCAG 3300 

TTTTTCTGTA TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT 3360 

AGAAAATTCA CTATTGGCTG GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAG 3420
3421 GCTGAGGTTG CGCCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 557 Protein sequence 



1 11 21 31 41 51 

I I I I I I 

MRQSLLFLTS WPFVLAPRP PDDPGFGPHQ RLEKLDSLLS DYDILSLSNI QQHSVRKRDL 60 

QTSTHVBTLL TFSALKRHFK LYLTSSTERF SONFKWWD GKNESEYTVK WQD FFTGHW 120 

GEPDSRVLAH IRDDDVIIRI NTDGAEYNIE FLWRFVNDTK DKRMLVYKSE DIKNVSRLQS 180 

PKVCGYLKVD NEELLPKGLV DREPPEELVH RVKRRADPDP MKNTCXLLW ADHRFYRYMG 240 

RGEESTTTNY LIELIDRVDD IYRNTSWDNA GFKGYGIQIE QIRILKSPQE VKPGEKHYNK. 300 

AKSYPNEEKD AWDVKMLLEO FSFDIAEEAS KVCbAHLFTY QDFDMGTLGL AYVGSPRANS 360 

HGGVCPKAYY SPVGKKNI YL NSGLTSTKNY GKTILTKEAD LVTTHELGHN FGAEHDPDGL 420

AECAPNEDQG GKYVMYPIAV SGDHENNKMF SNCSKQSIYK TIESKACECF OERSNKVCGN 480 

SRVDEGEECD PGIMYLNNDT CCNSDCTLKE GVQCSDRN3F CCKNCQFETA QKKCQEAINA 540 

TCKGVSYCTG NSSECPPPGN AEDDTVCLDL GKCKDGKCIP FCEREQCLES CACNETDNSC 600

KVCCRDLSGR CVPYVDAEQK NLFLRKGKPC TVGFCDMNGK CEKRVQDVIE RFWDFIDQLS 660 
INTFGKFLAD NIVGSVLVFS LIFWIPFS1L VHCV 

Seq ID NO: 558 DNA sequence

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20..2143

1 11 21 31 41 51

AGACACCTCT GCCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCCTGGTGC TCCTGGTGCT 60 

GGGCTGCTGC TTTGCTGCCC CCAGACAGCG CCAGTCCACC CTTGTGCTCT TCCCTGGAGA 120

CCTGAGAACC AATCTCACCG ACAGGCAGCT GGCAGAGGAA TACCTGTACC GCTATGGTTA 180 

CACTCGGGTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGCGC TGCTGCTTCT 240 

CCAGAAGCAA CTGTCCCTGC CCGAGACCGG TGAGCTGGAT AGCGCCACGC TGAAGGCCAT 300 

GCGAACCCCA CGGTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGCGACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTCGGAAG ACTTGCCGCG 420 

GGCGGTGATT GACGACGCCT TTGCCCGCGC CTTCGCACTG TGGAGCGCGG TGACGCCGCT 480 

CACCTTCACT CGCGTGTACA GCCGGGACGC AGACATCGTC ATCCAGTTTC GTGTCGCGGA 540

GCACGGAGAC GGGTATCCCT TCGACGGGAA GGACGGGCTC CTGGCACACG CCTTTCCTCC 600 

TGGCCCCGGC ATTCAGGGAG ACGCCCATTT CGACGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GCGGCCTGCC ACTTCCCCTT 720 

CATCTTCGAG GGCCGCTCCT ACTCTGCCTG CACCACCGAC GGTCGCTCCG ACGGCTTGCC 780 

CTGGTGCAGT ACCACGGCCA ACTACGACAC CGACGACCGG TTTGGCTTCT GCCCCAGCGA 840

GAGACTCTAC ACCCGGGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGACGGTCGC TCCGACGGCT ACCGCTGGTG 960 

CGCCACCACC GCCAACTACG ACCGGGACAA GCTCTTCGGC TTCTGCCCGA CCCGAGCTGA 1020 

CTCGACGGTG ATGGGGGGCA ACTCGGCGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

3 TACTCGACCT GT AC CAGCGA GGGCCGCGGA GATGGGCGCC TCTGGTGCGC 114 0 

A GCGACAAGAA GTGGGGCTTC TGCCCGGACC AAGGATACAG 12 00 

C ATGAGTTCGG CCACGCGCTG GGCTTAGATC ATTCCTCAGT 12 60 

3 CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

C ACCTCTATGG TCCTCGCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCACC ACACCGCAGC CCACGGCTCC CCCGACGGTC TGCCCCACCG GACCCCCCAC 1440 

TGTCCACCCC TCAGAGCGCC CCACAGCTGG CCCCACAGGT CCCCCCTCAG CTGGCCCCAC 1500 

AGGTCCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG CCTTTGAGTC CGGTGGACGA 1560 
TGCCTGCAAC GTGAACATCT TCGACGCCAT CGCGGAGATT G 
CAAGGATGGG AAGTACTGGC GATTCTCTGA GGGCAGGGGG A 
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CCTTATCGCC GACAAGTGGC CCGCGCTGCC CCGCAAGCTG GACTCGGTCT TTGAGGAGCC 1740 

GCTCTCCAAG AAGCTTTTCT TCTTCTCTGG GCGCCAGGTG TGGGTGTACA CAGGCGCGTC 1800 

GGTGCTGGGC CCGAGGCGTC 1GGACAAGCT GGGCCTGGGA GCCGACGTGG CCCAGGTGAC I8 60 

CGGGGCCCTC CGGAGTGGCA GGGGGAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTCGACGTG AAGGCGCAGA TGGTGGATCC CCGGAGCGCC AGCGAGGTGG ACCGGATGTT 1980 

CCCCGGGGTG CCTTTGGACA CGCACGACGT CTTCCAGTAC CGAGAGAAAG CCTATTTCTG 2 040 

CCAGGACCGC TTCTACTGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGACCAAGT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCCGGATA 2220

CAAACTGGTA TTCIGTTCTG GAGGAAAGGG AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 2280
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 

Protein Accession #: NP_004985.1

1 11 21 31 41 51 

MSLWQPLVLV LLVLGCCFAA PRQRQSTLVL FPGDLRTNLT DRQLAEEYLY RYGYTRVAEM 60

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPRCG VPDLGRFQTF EGDLKWHHHN 120

ITYWIQNYSE DLPRAVIDDA FARAFALWSA VTPLTFTRVY SRDADIVIQF GVAEHGDGYP 180 

FDGKDGLLAH AFPPGPGIQG DAHFDDDELW SLGKGVVVPT RFGNADGAAC HFPFIFEGRS 240

YSACTTDGRS DGLPWCSTTA NYDTDDRFGF CPSERLYTRD GNADGKPCQF PFIFQGQSYS 300

ACTTDGRSDG YRWCATTANY DRDKLFGFCP TRADSTVMGG NSAGELCVFP FTFLGKEYST 360

CTSEGRGDGR LWCATTSNFD SDKKWGFCPD QGYSLFLVAA HEFGHALGLD HSSVPEALMY 420

PMYRFTEGPP LHKDDVNGIR HLYGPRPEPE PRPPTTTTPQ PTAPPTVCPT GPPTVHPSER 480

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACNVNI FDAIAEIGNQ LYLFKDGKYW 540 

RFSEGRGSRP QGPFLIADKW PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TGASVLGPRR 600 

LDKLGLGADV AQVTGALRSG RGKMLLFSGR RLWRFDVKAQ MVDPRSASEV DRMFPGVPLD 660
THDVFQYREK AYFCQDRFYW RVSSRSELNQ VDQVGYVTYD ILQCPED 

Seq ID NO: 560 DNA sequence 

Nucleic Acid Accession #: NM_000213.1

Coding sequence: 127..5385

1 11 21 31 41 51

CGCCCGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCGAGTCCG 60 
CCCCGAGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGC 
AAGAGGAIGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC A 
AGCGTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGGCCCC 

ACGGAGTGTG TCCGTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAGGGAC 300

CGGCGCTGCA ACACCCAGGC GGAGCTGCTG GCCGCGGGCT GCCAGCGGGA GAGCATCGTG 360

GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCGGCGC 420 

AGCCAGATGT CCCCCCAAGG CCTGCGGGTC CGTCTGCGGC CCGGTGAGGA GCGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540 

TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGGGGCAGAA CCTGGCTCGG 600 

GTCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGCGTCCCGC AGACGGACAT GAGGCC1GAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720

CCCCCCTTCT CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780 

AAACTGCAGG GAGAGCGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGCGG CTTCGATGCC 840 

ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCTG 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGCGCCAA CGTGCTGGCT 960

GGCATCATGA GCCGCAACGA TGAACGGTGC CACCTGGACA CCACGGGCAC CTACACCCAG 1020

TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTCGC CAAGCACAAC 1080

ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140

TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200

CTGGAGGAGG CCTTCAATCG GATCCGCTCC AACCTGGACA TCCGGGCCCT AGACAGCCCC 1260 

CGAGGCCTTC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320

CACATCCGGC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380 

GATGGGACGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG ACGGCCTCAA GATGGACGCG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CTGCAAAAAG AGGTGCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560 

TGTGTGTGCA GCGAGGGCTG GAGTGGCCAG ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620

GACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAAGCCGT GCTCCGGCCG TGGGGAGTGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCCGCTACG AGGGTCAGTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCCG CACTTCCGGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860 

AATGCCACCT GCATCGACAG CAATGGGGGC ATCTG1AATG GACGTGGCCA CTGTGAGTGT 1920 

GGCCGCTGCC ACTGCCACCA GCAGTCGCTC TACACGGACA CCATCTGCGA GATCAACTAC 1980 

TCGGCGATCC ACCCGGGCCT CTGCGAGGAC CTACGCTCCT GCGTGCAGTG CCAGGCGTGG 2040

GGCACCGGCG AGAAGAAGGG GCGCACGTGT GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100

GACGAGCTTA AGAGAGCCGA GGAGGTGGTG GTGCGCTGCT CCTTCCGGGA CGAGGATGAC 2160 

GACTGCACCT ACAGCTACAC CATGGAAGGT GACGGCGCCC CTGGGCCCAA CAGCACTGTC 2220

CTGGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TGCCGCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400 

GAAGACCACT ACATGCTGCG GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACGCCCATG 2460 

CTGCGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATG 2520

CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580

GGGCTGTCCT TGCGCCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACAC1CGG 2640

GAGTGCGCCC AGCTGCGCCA GGAGGTGGAG GAGAACCTGA ACGAGGTCTA CAGGCAGATC 2700

TCCGGTGTAC ACAAGCTCCA GCAGACCAAG TTCCGGCAGC AGCCCAATGC CGGGAAAAAG 2760

CAAGACCACA CCATTGTGGA CACAGTGCTG ATGGCGCCCC GCTCGGCCAA GCCGGCCCTG 2820

CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACCTCAA GGTGGCCCCC 2880

GGCTACTACA CCCTCACTGC AGACCAGGAC GCCCGGGGCA TGGTGGAGTT CCAGGAGGGC 2940

GTGGAGCTGG TGGACGTACG GGTGCCCCTC TTTATCCGGC CTGAGGATGA CGACGAGAAG 3000

CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCACCCTCGG CCGCCGCCTG 3060
GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACGTGG TGTCCTTTGA GCAGCCTGAG 3120

TTCTCGGTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGGAC 3180 

GGCGGGAAGT CCCAGGTCTC CTACCGCACA CAGGATGGCA CCGCGCAGGG CAACCGGGAC 3240

TACATCCCCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GCGGCCGCCA GGTCCGCCGT 3360

TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TGGGCCAGCC CCACTCCACC 3420

ACCATCATCA TCAGGGACCC AGATGAACTG GACCGGAGCT TCACGAGTCA GATGTTGTCA 3480 

TCACAGCCAC CCCCTCACGG CGACCTGGGC GCCCCGCAGA ACCCCAA7GC TAAGGCCGCT 3540

GGGTCCAGGA AGATCCATTT CAACTGGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTCGA CAGCAAGGTG 3660 

CCCTCAGTGG AGCTCACCAA CCTGTACCCG TATTGCGACT ATGAGATGAA GGTGTGCGCC 3720

TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780 

GTGCCCAGCG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840 

AGCTGGGCTG AGCCGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900

CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAGTCCC AGCCCTACCG CTACACGGTG 4020

AAGGCGCGCA ACGGGGCCGG CTGGGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGCGATG ACGTTCTACG CTCTCCATCG 4200

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260 

TTTGCCTTCC CGGGCAGCAC CAACTCCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320

TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 

ACACGGGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTCGACCAC ACTGCCGAGG 4440 

GACTACTCCA CCCTCACCTC CGTCTCCTCC CACGACTCTC GCCTGACTGC TGGTGTGCCC 4500

GACACGCCCA CCCGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCAG AGTGAGCTGG 4560 

CAGGAGCCGC GGTGCGAGCG GCCGCTGCAG GGCTACAGTG TGGAGTACCA GCTGCTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTCGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGTGTTC CGCGTGCGGG CCCAGAGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCCGCA GAGCCCACTG 4800 

TGTCCCCTGC CAGGCTCCGC CTTCACTTTG AGCACTCCCA GTGCCCCAGG CCCGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGCGGCCACG GAGGCCCAAT 4920

GGGGATATCG TCGGCTACCT GGTGACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCCGGCTGA CCGTGCCGGG CCTCAGCGAG 5040

AACGTGCCCT ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGGGCTTCGG GCCAGAGCGC 5100 

GAGGGCATCA TCACCATAGA GTCCCAGGAT GGAGGACCCT TCCCGCAGCT GGGCAGCCGT 5160 

GCCGGGCTCT TCCAGCACCC GCTGCAAAGC GAGTACAGCA GCATCACCAC CACCCACACC 5220 

AGCGCCACCG AGCCCTTCCT AGTGGATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280

GGCGGCTCCC TCACCCGGCA TGTGACCCAG GAGTTTGTGA GCCGGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACCGCA CCCTGCCCCA 5400 

CCCCCGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TCCTGGGAGG CATGAAGGGG GCAAGGTCCG TCCTCTGTGG GCCCAAACCT ATTTGTAACC 5580 

AAAGAGCTGG GAGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 5640
TACTG 

Seq ID NO: 561 Protein sequence 
Protein Accession #: NP_000204.1

1 11 21 31 41 51 

| | | | | |

MAGPRPSPWA RLLLAALISV SLSGTLANRC KKAPVKSCTE CVRVDKDCAY CTDEMFRDRR 60

CNTQAELLAA GCQRESIVVM ESSFQITEET QIDTTLRRSQ MSPQGLRVRL RPGEERHFEL 120

EVFEPLESPV DLYILMDFSN SMSDDLDNLK KMGQNLARVL SQLTSDYTIG FGKFVDKVSV 180 

PQTDMRPEKL KEPWPNSDPP FSFKNVISLT EDVDEFRNKL QGERISGNLD APEGGFDAIL 240 

QTAVCTRDIG WRPDSTHLLV FSTESAFHYE ADGANVLAGI MSRNDERCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNII PIFAVTNYSY SYYEKLHTYF PVSSLGVLQE DSSNIVELLE 360

EAFNRIRSNL DIRALDSPRG LRTEVTSKMF QKTRTGSFHI RRGEVGIYQV QLRALEHVDG 420 

THVCQLPEDQ KGNIHLKPSF SDGLKMDAGI ICDVCTCELQ KEVRSARCSF NGDFVCGQCV 480 

CSEGWSGQTC NCSTGSLSDI QPCLREGEDK PCSGRGECQC GHCVCYGEGR YEGQFCEYDN 540 

FQCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCECGR 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQCQAWGT GEKKGRTCEE CNFKVKMVDE 660

LKRAEEVVVR CSFRDEDDDC TYSYTMEGDG APGPNSTVLV HKKKDCPPGS FWWLIPLLLL 720

LLPLLALLLL LCWKYCACCK ACLALLPCCN RGHMVGFKED HYMLRENLMA SDHLDTPMLR 780 

SGNLKGRDVV RWKVTNNMQR PGFATHAASI NPTELVPYGL SLRLARLCTE NLLKPDTREC 840

AQLRQEVEEN LNEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVEFQEGVE LVDVRVPLFI RPEDDDEKQL 960 

LVEAIDVPAG TATLGRRLVN ITIIKEQARD VVSFEQPEFS VSRGDQVARI PVIRRVLDGG 1020

KSQVSYRTQD GTAQGNRDYI PVEGELLFQP GEAWKELQVK LLELQEVDSL LRGRQVRRFH 1080 

VQLSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMLSSQ PPPHGDLGAP QNPNAKAAGS 1140

RKIHFNWLPP SGKPMGYRVK YWIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200

AQGEGPYSSL VSCRTHQEVP SEPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGLV 1260

NDDNRPIGPM KKVLVDNPKN RMLLIENLRE SQPYRYTVKA RNGAGWGPER EAIINLATQP 1320

KRPMSIPIIP DIPIVDAQSG EDYDSFLMYS DDVLRSPSGS QRPSVSDDTE HLVNGRMDFA 1380

FPGSTNSLHR MTTTSAAAYG THLSPHVPHR VLSTSSTLTR DYNSLTRSEH SHSTTLPRDY 1440

STLTSVSSHD SRLTAGVPDT PTRLVFSALG PTSLRVSWQE PRCERPLQGY SVEYQLLNGG 1500 

ELHRLNIPNP AQTSVVVEDL LPNHSYVFRV RAQSQEGWGR EREGVITIES QVHPQSPLCP 1560

LPGSAFTLST PSAPGPLVFT ALSPDSLQLS WERPRRPNGD IVGYLVTCEM AQGGGPATAF 1620 

RVDGDSPESR LTVPGLSENV PYKFKVQART TEGFGPEREG IITIESQDGG PFPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVDGPT LGAQHLEAGG SLTRHVTQEF VSRTLTTSGT 1740 
LSTHMDQQFF QT 
GCACGAGGGC GCTTTTGTCT
AGTAACCGAC TTTCCTCCGG 
CGGCTGTTCC CCCGGAGGGT 
GCAGAGGAGT AGGGTCCTTT 
GGTACTGACC CTACTCTCCA 
GAGCCCATCG CCTGGGACCT 
CAAGGGCCTT CCAGACCATC 
h ACACTGACCT 



AAGCTTCTGC 
TACAGCCGGC 
GAAGGCAGCT 






3 AGTGATGGAG 



CTCATTGCCT GGGCAAGGCC 
CACTTTGGGA GGCTGAGGTG 
CAACATGGCG AAACCCCATC 
GGCCTGTAAT CCCAGTTCCT 
GAGGTTGCAG TGAACCGAGA 
\ AAAAAAAGAA 



CATCCAGAAG 
AGACATGTCC 
GAACTATTAC 



GGTGGATCAC 



AGATGGGAGT 
TAGGCCTTGA 
GGTTGCGGTG 
CTGAGGTCAG 
ATACAAAAGT 




TCCTCAGCAG TATGGCTCTG 
TGATAITTTC AACCCTACTT 
TATGCTCAAT TATTTGGTGT 
CAGTTGAAGA GGTTGTGTGG 
TTCTCATTTT ACATT TTAAA 
C CAAAGCCTGC 
A CTAATAAAGT 



TCGCACTGCT GTACCCAGCC T 

AAGAAAAAGC CTGTTTAATG CACAGGTGTG AGTGGATTGC 
GATCTCGCCC TTACCCCGGG GTCTGGTGTA TGCTGTGCTT 
ACATCTCTTA GAT3TCCCAA CTTCAGCTGT TGGGAGATGG 
CCTAAACATC TGTCTGGGGT TCCTTTAGTC TTGAATGTCT 
TGAGCCTCTC TTCCACAAGA GCTCCTCCAT GTTTGGATAG 
GTGGGCTGTT GGGAGTGAGG ATGGAGTGTT C 
GTCGTTCCTC CAACATAGTG TGTATTGGTC T 
TCAAGTTATG GACATTGTGG CCACCATGTG GCTTAAATGA 
GGAATATATA TTTCAAAAAA A, 



MKHVLNLYLL GWLTLLSIP VRVMESLEGL 



Seq ID NO: 564 DNA sequence
Nucleic Acid A 
Coding s 



TTRSQLANTE PTKGLPDHPS 



TCAAAGCTTA 



TTTCGTTTTC ATGCTTTACC 
TTCTTAATTA GAGACAAGAA 
AGCCAGCCAC 



A TGGGGTTCAA 



AGAAAATCCA 
ACCTGTTTCA 
GAAATCAAAC 
TGACACGCAT 



AATGAATTTG 
TTGCTGAATG 
TTCTATCTCA 



ACACAATTGT 



TCAGTTTTGT 
GATCGCTATC 
ACGAAGGTTT 



CCTTTGGGGG 
GTGCTGGTGA 
AGGCAATTCA 



TATCTGTTTG 
ATGGTCAGCC 
TCAAATGGCA 
TTCTGATCGG 
TAAGTCAGTC 
TTTTTACCTG 
ACAGGC1TTT 
TCTTGTCTGC 
TTTCAAGAAG 
TGCAAAGTGT 
TTTATTGTTT 
TTAAAAAAAA 



GTGGATCTTC 
GGTTGCAGAC 
TGGACCTTGG 
CATGTATACT 
CAAGCCATTT 

AACAGAGGAC 



GACGGGCCAG 
CTTTATCTCA 
TTCCACATTA 



CTTTGCTTAC 
CAAATAACGA 
GAAAGAACAC 



TACTTCAAGT 
TCCATCGTGT 
GGGGACTCTC 
ATCATGGCTG 
AATATCCATG 
ACCTATGTGA 



ACTGCTCAAA 
ACAGCTGCTT 
GGTACATCCA 
ACCAGAGCAT 
GCAGAATTCC 
TCCTATATTA 
CAATAATTTA 
TCAGAACCAG 



CACCCTTCAC 
GGCAAGCATC 
CAGCTTCATA 
TCCATTTCGA 
CAGATACACT 
GATAAGCATT 
CATAACCTTC 
GCCAAACATC 



GTTTG 



GTGTAATGTT T< 
GCTGTTCAAA Ai 
GAGAAGATCG Gi 
GTTGGAATCG ATATGTACAA AGTGTAAATA AATGTTTCTT 1380 



CAGGGTTGTT 
TTTTACTTTT 
CTGCAAAGAA 

GAGTGAAAGC 



TTACACTGAT 1320 



I 



I 



MGFNLTLAKL PNNELHGQES HNSGNRSDGP GKNTTLHNEF DTIVLPVLYL IIFVASILLN 

GLAVWIPPHI PJJKTSFIFYL KNIWADLIM TLTFPFRIVH DAGFGPWYFK FILCRYTSVL 

FYANMYTSIV FLGLISIDRY LKWKPFGDS RMYSITFTKV LSVCVWVIKA VLSLPNIILT 

NGQPTEDNIH DCSKLKSPLG VKWHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 

ISQSSRKRKH NQSIRVWAV FFTCFLPYHL CRIPFTFSHL DRLLDESAQK ILYYCKEITL 

FLSACNVCLD PIIYFFMCRS FSRRLFKKSN IRTRSESIRS LQSVRRSSVR IYYDYTDV 

Seq ID NO: 566 DNA sequence

Nucleic Acid Accession #: NM_005365.1 

Coding sequence: 1..948 



I 



G AGCAGAGGAG TCCGCACTGC AAGCCTGATG AAGACCTTGA AGCCCAAGGA 
GAGGACTTGG GCCTGATGGG TGCACAGGAA CCCACAGGCG AGGAGGAGGA GACTACCTCC 
TCCTCIGACA GCAAGGAGGA GGAGGTGTCT GCTGCIGGGT CATCAAGTCC TCCCCAGAGT 
CCTCAGGGAG GCGCTTCCTC CTCCATTTCC GTCTACTACA CTTTATGGAG CCAATTCGAT 
GAGGGCTCCA GCAGTCAAGA AGAGGAAGAG CCAAGCTCCT CGGTCGACCC AGCTCAGCTG 
GAGTTCATGT TCCAAGAAGC ACTGAAATTG AAGGTGGCTG AGTTGGTTCA TTTCCTGCTC 
CACAAATATC GAGTCAAGGA GCCGGTCACA AAGGCAGAAA TGCTGGAGAG CGTCATCAAA 420

TACTTTCC IGTGATCTTC GGCAAAGCCT CCGAGTTCAT GCAGGTGATC 480 

3 ATGIGAAGGA GGTGGACCCC GCCGGCCACT CCTACATCCT TGTCACTGCT 540 

CTTGGCCTCT CGTGCGATAG CATGCTGGGT GATGGTCATA GCATGCCCAA GGCCGCCCTC 600

CTGATCATTG TCCTGGGTGT GATCCTAACC AAAGACAACT GCGCCCCTGA AGAGGTTATC 660

TGGGAAGCGT TGAGTGTGAT GGGGGTGTAT GTTGGGAAGG AGCACATGTT CTACGGGGAG 720 

CCCAGGAAGC TGCTCACCCA AGATTGGGTG CAGGAAAACT ACCTGGAGTA CCGGCAGGTG 780 

CCCGGCAGTG ATCCTGCGCA CTACGAGTTC CTGTGGGGTT CCAAGGCCCA CGCTGAAACC 840 

AGCTATGAGA AGGTCATAAA TTATTTGGTC ATGCTCAATG CAAGAGAGCC CATCTGCTAC 900 
CCATCCCTTT ATGAAGAGGT TTTGGGAGAG GAGCAAGAGG GAGTCTGA 

1 11 21 31 41 51 

| | | | | |

MSLEQRSPHC KPDEDLEAQG EDLGLMGAQE PTGEEEETTS SSDSKEEEVS AAGSSSPPQS 60

PQGGASSSIS VYYTLWSQFD EGSSSQEEEE PSSSVDPAQL EFMFQEALKL KVAELVHFLL 120

HKYRVKEPVT KAEMLESVIK NYKRYFPVIF GKASEFMQVI FGTDVKEVDP AGHSYILVTA 180 

LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEEVI WEALSVMGVY VGKEHMFYGE 240

PRKLLTQDWV QENYLEYRQV PGSDPAHYEF LWGSKAHAET SYEKVINYLV MLNAREPICY 300 
PSLYEEVLGE EQEGV 

Seq ID NO: 568 DNA sequence
Nucleic Acid Accession #: NM_014400 
Coding sequence: 86..1126

1 11 21 31 41 51 

| | | | | |

GGTTACTCAT CCIGGGCTCA GGTAAGAGGG CCCGAGCTCG GAGGCGGCAC ACCCAGGGGG 60 
GACGCCAAGG GAGCAGGACG GAGCCATGGA CCCCGCCAGG AAAGCAGGTG CCCAGGCCAT 120 
GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 180 
GTGCTACAGC TGCGTGCAGA AAGCAGATGA CGGATGCTCC CCGAACAAGA TGAAGACAGT 240 
GAAGTGCGCG CCGGGCGTGG ACGTCTGCAC CGAGGCCGTG GGGGCGGTGG AGACCATCCA 300 
CGGACAATTC TCGCTGGCAG TGCSGGGTTG CGGTTCGGGA CTCCCCGGCA AGAATGACCG 360 
CGGCCTGGAT CTTCACGGGC TTCTGGCGTT CATCCAGCTG CAGCAATGCG CTCAGGATCG 420 
CTGCAACGCC AAGCTCAACC TCACCTCGCG GGCGCTCGAC CCGGCAGGTA ATGAGAGTGC 480 
ATACCCGCCC AACGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGGG AGGCGTGCCA 540 
GGGTACATCG CCGCCGGTCG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAAGGGCTG 600 
CTTCGACGGC AACGTCACCT TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCCGGGG 660 
CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 720 
TGGCTCCTGT TGCCAGGGGT CCCGCTGTAA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780 
CCCTCGAATC CCACCCCTTG TCCGGCTGCC CCCTCCAGAG CCCACGACTG TGGCCTCAAC 840 
CACATCTGTC ACCACTTCTA CCTCGGCCCC A 
GCCAGCGCCA ACCAGTCAGA CTCCGAGACA G 

GGAGCCCAGG TTGACTGGAG GCGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 
TCCTGCAAAA GGGGGGCCCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 
ATTGGCAGCC CTTCTGTTGG CCGTGGCTGC TGGTGTCCTA CTGTGAGCTT CTCCACCTGG 
AAATTTCCCT CTCACCTACT TCTCTGGCCC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 
CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCCCA GTATCCCCAG 
CTTCTGCTGC GCTGGTTTGC GGCTTTGGGA AATAAAATAC CGTTGTATAT ATTCTGGCAG 
GGCTGTTCTA GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATCCTTGTC ICTCCGC ■ 
TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCACGGGGAA 



GGTGGGACAA TGGCTCCCCA CTCTAAGCAC T 
ATCGGTTCCC CATATGTCTT CCTTACTAGA CTGTGAGCTC CTCGAGGGCA GGGACCGTGC 1620
CTTATGTCTG TGTGTGATCA GTTTCTGGCA CATAAATGCC TCAATAAAGA TTTAATTACT 1680
TTGTATAGTG AAAAAAAA 



| | | | | |

MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV
CTEAVGAVET IHGQFSLAVX GCGSGLPGKN DRGLDLHGLL AFIQLQQCAQ DRCNAKLNLT
SRALDPAGNE SAYPPNGVEC YSCVGLSREA CQGTSPPVVS CYNASDHVYK GCFDGNVTLT
AANVTVSLPV RGCVQDEFCT RDGVTGPGFT LSGSCCQGSR CNSDLRNKTY FSPRIPPLVR 
LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEPRLTGGA 
AGHQDRSNSG QYPAKGGPQQ PHNKGCVAPT AGLAALLLAV AAGVLL 

Seq ID NO: 570 DNA sequence 

Nucleic Acid Accession #: NM_005329.1 

Coding sequence: 1..1662 



ATGCCGGTGC AGCTGACGAC AGCCCTGCGT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 

GTGCTGGGTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGACGTGCCG GCCAGGCCCT GAACCTCCCC

TCCCCGCGGC GGGGCTCGGT GGCACTGTGC ATTGCCGCGT ACCAGGAGGA CCCTGACTAC 

TTGCGCAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 

GTGGTGGATG GCAACCGCCA GGAGGACGCC TACATGCTGG ACATCTTCCA CGAGGTGCTG 

GGCGGCACCG AGCAGGCCGG CTTCTTTGTG TGGCGCAGCA ACTTCCATGA GGCAGGCGAG 

GGTGAGACGG AGGCCAGCCT GCAGGAGGGC ATGGACCGTG TGCGGGATGT GGTGCGGGi i 

AGCACCTTCT CGTGCATCAT GCAGAAGTGG GGAGGCAAGC GCGAGGTCAT GTACACGGCC 
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TTCAAGGCCC TCGGCGATTC GGTGGACTAC ATCCAGGTGT GCGACTCTGA CACTGTGCTG 660 
GATCCAGCCT GCACCATCGA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGGGA 720 
GTCGGGGGAG ATGTCCAGAT CCTCAACAAG TACGACTCAT GGATTTCCTT CCTGAGCAGC 780 
GTGCGGTACT GGATGGCCTT CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 840 
CAGTGTATTA GTGGGCCCTT GGGCATGTAC CGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900 
GACTGGTACC ATCAGAAGTT CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 960

ACCAACCGAG TCCTGAGCCT TGGCTACCGA ACTAAGTATA CCGCGCGCTC CAAGTGCCTC 1020 

ACAGAGACCC CCACTAAGTA CCTCCGGTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080 

TACTTCCGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATGACC 1140 

TACGAGTCAG TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCACGGT TATACAGCTT 1200

TTCTACCGGG GCCGCATCTG GAACATTCTC CTCTTCCTGC TGACGGTGCA GCTGGTGGGC 1260

ATTATCAAGG CCACCTACGC CTGCTTCCTT CGGGGCAATG CAGAGATGAT CTTCATGTCC 1320 

CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AGATCTTTGC CATTGCTACC 1380

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CGAAAAACCA TTGTGGTGAA CTTCATTGGC 1440 

CTCATTCCTG TGTCCATCTG GGTGGCAGTT CTCCTGGGAG GGCTGGCCTA CACAGCTTAT 1500 

TGCCAGGACC TGTTCAGTGA GACAGAGCTA GCCTTCCTTG TCTCTGGGGC "ATACTGTAT 1560

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCCA TCATCGCCCG GCGATGTGGG 1620
AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GA 

Seq ID NO: 571 Protein sequence



MPVQLTTALR VVGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA ILGLHLLIQS
LFAFLEHRRM RRAGQALNLP SPRRGSVALC IAAYQEDPDY LRKCLRSAQR ISFPDLKVVM
VVDGNRQEDA YMLDIFHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG MDRVRDVVRA
STFSCIMQKW GGKREVMYTA FKALGDSVDY IQVCDSDTVL DPACTIEMLR VLEEDPQVGG 
VGGDVQILNK YDSWISFLSS VRYWMAFNVE RACQSYFGCV QCISGPLGMY RNSLLQQFLE 
DWYHQKFLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 
YFREWLYNSL WFHKHHLWMT YESVVTGFFP FFLIATVIQL FYRGRIWNIL LFLLTVQLVG
IIKATYACFL RGNAEMIFMS LYSLLYMSSL LPAKIFAIAT INKSGWGTSG RKTIVVNFIG
LIPVSIWVAV LLGGLAYTAY CQDLFSETEL AFLVSGAILY GCYWVALLML YLAIIARRCG
KKPEQYSLAF AEV 

Seq ID NO: 572 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 148..7095

1 11 21 31 41 51

3 CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 
C CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 
G CGAATCCTAA AGCGTTTCCT CGCTTGCATT 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 
GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAAC 
AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA A 
GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 
GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 
GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 
TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA T 
AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT T 
ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC A 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

G GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

A GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

: ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040

A CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

C AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

3 GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

T TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGC7TC AAGTAGTGAT 2520 

TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 
AGCCTTGCTC AGTATTCTGA
3 AATCTGGTGT 



TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 
TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 
TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 






TCAGGGTTCC TTATTTAGCG GCCCTAGCCA 
AGCCTACTCA 
TTCTTTTACC 
CTGAATTTAC 



ACTGAACTGC 



AGCCTCTTCT GATAGTGAAT 
TTCTTCACCT GTTTCTGTAG 
TAAGGCGCTT TCTAAAAGTG 
TTTCAATGAG ATGGTTTACC 
ATGATAATGT AAATAAGTTG AATGCGTCTT 
TCCCTTGCTC 



TCTCAAGCAT CTGGTGACAC 
TCCTCTGACC CTGCTTCTAG 
T CTTTTAGTAC 



CCCAAAGTTG ATAAAATTAG 



ATGCACTCTG CTTCACTTCA 
GTTTTGTTAA AAAGTGAAAG 
TTGTTCCAAA CGGCCAATTT 
TTTGCTACAC CTGTTTTATC 
CATTCCGATG AAATTTTAAC 



CAGGAAAAGG 



TCAGACAGTC 
TCCCAAAAGC 
CCTCTCAGCC 
GGGCAAGGTA 
GACACTAATG 



CCATTACAGC 
CTTCTAAGGC 
GTGGTGAAGA 
ATGGCTTATC 
TAATGAATGA 
ACTCACTATC 
AAACTGGTAT 
ACAATGATGG 
CTGAATCTAA 
CCTCAGATAG 
AAAAAGATGC 
AGTCCCCAAC 



AGGTTTGACC ATTTCCTATG 
TTCCCACCAA GTGGTACCTT 
GGAGATTAAC CAGGCCCATC 
AATTGATGAA CCATTAAATA 
CTCCACCAAA AGTTCTGTTA 
TACATTTGTA TCTACTGATC 
TGTTTCTCCC CACAGAGATG 
AACTTCTGAG 



CATTCATAAG T 
TTCAGACACC C 
TGAGAATTCT G 



AGCATGGGCA G 



TGTATGTCAT 
CACGAAAACA 
GAAGAAGATA 
CCTGGTAAAT 
AATGACATTC 
GTTCTGACAA 
AATGAGACTT 
CTGGCAGCAG 
ATCATCTGTT ACTAGCGAGA 
GAGTCTCGTA 



GTGCCAAATC 
GTGATGATGA 
GCTCATCCTA 
GTCTTATGGA 
ATAGAGTCAC 
CACCATCAGC 



GTGATGAAGA 
CCACAGATTT 
GTGACTCAGA 
ACTCAGAAGT 



TGATGCCGGT 
TGATGATGAC 
TAGAGAATCA 
TCAGAATAAT 
AAGTGTATCC 
AAATGGGCTA 
TGCTCTGCTT 
AAGTGGATCA 
CAGTTTTGCA 
AATAACTCCT 
GTTCCACGTT 



C TGGAGGAAA 



C AATTCCAATA 



AGGTATTACA 
TATCGTTGCC 
ACTGACTGAT 
TGCTGCCCAA 
TAATGTGGAA 
A TCAGTACTGG 



ATATCCACAC 
AAGCACTTTC 
GAGACACTGA 



CTCCAACACC 
CAAAGCATGT 
AAGAGTTTTi> 



CTAAGAAACA 
ACACAGTAT C 
CTGACCTTTG 
CACTGCAGTG 
CAGATTCAAC 
AGAAATTATT 
GCCATACTTA 
CTCCTCATTC 
CAGTCAAATA 
AATCGAACTT 



CAAAAATAAA 
ACTACACG CA 
TGAGAAAGGC 
CTGGAGTTGG 
ACGAAGGAAC 
TGGTACAAAC 
GTAAAGAAAC 
CTGGACCAGC 



AGCCTATGCC 



TGTCAACATA 
TGAGGAGCAA 
TGAGGTGCTG 



TTCATCATTA 



CTTCTATCAT 
CAGACTACAT 
CCCAGCACCC 



TATGATCATA 
TATATCAATG 
GGCCCACTGA 
GTTATTGTCA 
CCTGCCGATG 
CTTGCCTATT 
CAGAAAGGAA 
ATGGGAGTAC 
AAGCGCCATG 
ACATATATTG 
TTTGGCTTCT 
TATGTCTTCA 
GACAGTCATA 
AAGCTAGAGA 
GCAGCCCTAA 



GCAGGGTTAA 
CCAATTATGT 
AATCCACAGC 
TGATAACAAA 
GGAGTGAGGA 
ATACTGTGAG 
GACCCAGTGG 
CAGAGTACTC 



CAGTGGGGCC T> 
TGCTAGACAG 
TAAAACACAT 
TTCATGATAC 



A GAG CAATGAA 



GACCATAATG CCCAACTGGT GGTTATGATT C 



AGATGAGCCT 
TCTATCTAAT 
TTATGTACTT 



CCAAATCCAG ATAGCCCCAT TAGTAAA 



GCTGCCAATA GGGATGGGCC TATGATTGTT 
CTCTGACAAC CCTTATGCAC 
CCAAGATGAT CAAT CTGATG 
TCTACAAAGT GATCCTCAGC 
TGGACAGTAA TGGTGCAGCA 
TTTAACACAG AAAGGGGTGG 
TAGGCAGGAA AATCAGTCTA 
AGGATTCTGC 



TACCAGGTAG 
TATCAGTTTC 
TCCACCTCTC 



TTTGAACTTA 
CATGATGAGC 
CAACTAGAAA 



AGAGCTTTAA 
TTATAATTCA 
ACTTTCAG7G 



TTCCTAAAAT 



TATTTCTAAG AATGGAATTG 
TTTATAGAGG TTAGGAATTC 
GCTGTATTTG TAGCAAT TAT 
AAATAAAACA CTCTTCCATA 
C TGTTACTTAT 



CAAACTACAG 
TGATATTCAA 



CTTGTGAGCA 
TTGCCTGATG 
GGGGACTCAC 
GTTCTGTTAT 
CGCCAAATTT 
TGTTTGAACT 
TTCTGTATTG 



ATGGAGGAGT 
AAGAAAATTC 
TCTTTGCTGA 
CAAGGCAGGA 
GAAATATAGC 
ATCTGAGCAT 



AGAAGATGAA 
GGTCACTCTT 
GGACTT7ATC 
TCCTAAATGG 
AAAAGAAGAA 



CGTGGATGTT 



CATTTTACAA 
GCCCTAGTGT 
CTGAGTCAAG 



CTTTTAATAC 
CTGCAGTATT 
CTCCATGGAC 



TAAGTCATTA A 



TTGCAAAAAT 



ACCAGTTTTC T 
TTTAACTTTT G 
ACCTTACCAA A 



TGGAAATGAG 




AAGGCATGTA 
TAAGCTTATA 
ATTTGCTGGT 



4320 
•13 80 

4560 
4620 
4680 
4740 
4800 
4860 
4920 
4380 
5040 



TATCTTTCCA 
TGCAGATTTA 
CCAGGAAGTG 
AGACAACAAG 



TGAAGATTTC 
GTACGGGAAC 



6420 
6480 
6540 
S600 
6660 
6720 
6780 
6840 
6900 



A CAATGTGTGC 
3 AATTTTACAG 
3 AAAATTT CAA 
T CAAATT7TTA 
AGTAGCCTGT 
CACCTAAAGT 
CAAATTTATA 
CTGTGTAATT 



7200 
7260 
7320 
IQLLCVCRLD WANGYYRQQR
TQVNVNLKKL KFQGWDKTSL 
GKCNMSSDGS EHSLEGQKFP 



I 



DTVSISESQL AVPCEVLTMQ 



GSKTVLRSPH 
ENISCGYIPS 
TAQPDVGSGR 
TEVTPHAFTP 
LNTTPAASSS 
ILPQVTSATE 



LGAILNNLLP NMSYVLQIVA 
AIVNPGRDSA 
NSTSQPVTKL 
MMLSGTAESL NTVSITEYEE 



I 

" TGALNQKNWG K 
TVEINLTNDY R 
LEMQIYCFDA DRFSSFESAV KGKGKLRALS 
ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 
QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 
TMIEKFAVLY QQLDGEDQTK 
SDQLIVDMPT DNPELDLFPE 
TNQIRKKEPQ ISTTTKYNRI GTKYNEAKTN 
QTVTELPPHT VEGTSASLND 



ESFLQTNYTE 
SSRQQDLVST 
DSALHATPVF 



SLFSGPSHIP IPKSSLITPT 
PVSVAEFTYT 
LNASLQETSV 
KPVLSANSEP 
AVPSDPILVE 
TISYASEKYE 
EPLNTLINKL 



IRVDESEKTT 
VNWYSQTTQ 
PSVDVSFESI 
PVAGGDLLLE 
SSGPEPSYAL 
ASLLQPTHAL 
LSKSEI IYGN 
SISSTKGMFP GSLAHTTTKV 
ASSDPASSEH LSPSTQLLFY 



SEDSTSSGSE 
PVYNGETPLQ 
PSLAQYSDVL 



TPKVDKISST 
PVLLKSESSH 
IHSDEILTST 



SGDGEWSGAS 
ETELQIPSFN 
FDHEISQVPE 
ETSASFSTEV 



ESLKDPSMEG NVWFPSSTDI 
QGPSVTDLEM PHYSTFAYFP 
PSYSSEVFPL VTPLLLDNQI 
PFSSASFSSE LFRHLHTVSQ 
STTHAASETL EFGSESGVLY 
TVSYSSAIPV HDSVGVTYQG 
SDSSFLLPDT DGLTALNISS 



KCMSCSSYRE SQEKVMNDSD 
SPGKSPSANG LSQKHNDGKE 



TADSSNHPDN 
OGPLKSTAED 
VLAYYTVRNF 



NEEKLIIQDF 
VHDEHGGVTA 
SLVSTRQEEN 



LESEKKAVIP 
PISDDVGAIP 
KHKNRYINIV 
FWRMIWEHNV 
TLRNTKIKKG 
VHCSAGVGRT 
EAILSKETEV 
KNRTSSIIPV 
WDHNAQLVVM 
I LEATQDDYV 
GTFCALTTLM 
PSTSLDSNGA 



THENSLMDQN 
ENDIQTGSAL 
ILAAGDSEIT 
LVIVSALTFI 



NPISYSLSEN 
LPLSPESKAW 
PGFPQSPTSS 
CLWLVGILI 
LHASSGFTEE 



GTYIVLDSML 
LDSHIHAYVN 
ERSRVGISSL 
I PDGQNMAED 
LEVRHFQCPK 
HQLEKENSVD 



EKGRRKCDQY 
VTQYHYTQWP 
QQIQHEGTVN 
ALLIPGPAGK 
SGEGTDYINA 



NNFSVQPTHT VSOASGDTSL 
LLQPSFQASD VDTLLKTVLP 
VPVFDVSPTE HMHSASLQGL 
NQAHPPKGRH VFATPVLSID 
NGHVAITAVS 
DRGSDGLSIH 
SSDSQTGMDR 
SGQGTSDSLN 

VTSENSEVFH 
YWRKCFQTAK 
FETLKEFYQE 
DYINANYVDG 
WPADGSEEYG 



TKLEKQFQLL 
SYIMGYYQSN 
PINCESFKVT 
TFELISVIKE 
MRPGVFADIB 



NFLVTQKSVQ 
VLTFVRKAAY 
QRKYLVQTEE 



Seq ID NO: 574
Nucleic Acid A 
Coding sequenc 



2100 
2160 
2220 
2230 



ATTTCCTTCG 



CACACATACG 
CAAAAAAAAC 
CGGCGAGGGG 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG 
CTTGTTGAAG AGATTGGCTG GT OCT ATACA 
AAATATCCAA CATGTAATAG CCCAAAACAA 
CAAGTAAATG TGAATCTTAA GAAACTTAAA 
AACACATTCA TTCATAACAC TGGGAAAACA 
GTCAGCGGAG GAGTTTCAGA AATGGTGTTT 
AAATGCAATA TGTCATCTGA TGG AT CAG AG 
GAGATGCAAA 



TCTATACACT 
CTCTCCACTC 
CGAATCCTAA 
GCTAATGGAT 



A AACAAACAAA 



TCTCCTATCA 
TTTCAGGGTT 
GTGGAAATTA 
AAAGCAAGCA 



ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 



CGCTTGCATT 
ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 



GGAAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 
TGGGAAAGAC 



:tgctt tgatgcggac 
agttaagagc tttatccatt 
cgattattga tggagtcgaa 
tcatactgt' 



TGACATCTCC T 



CGATTTTCAA 
TTGTTTGAGG 
AGTGTTAGTC 
CCAAACICAA 



TCTCTGAAAG CCAGTTGGCT 
TCATGCTGAT GGACTACTTA 
AGGTGTTTTC CTCATACACT 



GGTGCTATTC 
TGCACTAATG 
AAT CCTGAAC 



ACGAAATACA 



GGAAAAAGGA 
ATGAAGCCAA 
TTCCCAATAC 



TTATGATACC 
AACCAAGCAT 
GCTACCCAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAGGCGCT 



CCAGAGAATT 
ATGATTGAGA 

ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 



G7TTTGAGGA 
TTGGGACAGA 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGATTGT 
AAGTTCTTAC 
TTCGAGAGCA 
AGATTCATGA 
A7ACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 
TTCTTCAGAT 
TTGTCGACAT 



AGCAGTCAAA 
AGAAAATTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAGAT 



ACAGTACAAG 
AGCAGTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TCAAGACTTG 
AGTAGCCATA 



GACTAACCGA 



CAGCCTCTTT 
GGACTG CAG A 
CCAGTTTCAA 



GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 



TCTACCACAA 
TCCCCAACAA 
TCCACTTCCC 
ACTGTGACTG 



CTGGTAGAGA 



CAAGGAGGAG 



AACCAGTCAC TAAATTAGCC 
AACTGCCACC TCACACTGTG 
TTCTTAGATC TCCACATATG 
TAACAGAATA TGAGGAGGAG 



AAGGGTATAT ATTTTCCTCC 
AATCTGCTAG AAATGCTTCC 
TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 
GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCACGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAG7CTC GTAT7GGTCT AGCTGAGGGG 2460 

TTGGAATCCG AGAAGAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCGG AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAGTGGGTT TACTGAAGAA TTTGAGACAC TGAAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCT GTACTGTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC 7GAAATCCAC AGCTGAAGAT 3000 

TTCTGGAGAA TGATATGGGA ACATAATGTG GAAGTTATTG TCATGATAAC AAACCTCGTG 3060 

GAGAAAGGAA GGAGAAAATG TGATCAGTAC TGGCCTGCCG ATGGGAGTGA GGAGTACGGG 3120

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGACCCAG TGGACG7GTG 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GTGCTGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTGGG GCCTGTTGTC 3360 

GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCA 3480

CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TACACTGGTT 3540

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCTG 3660 

AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTGGCAT TTCATCCCTG 3780 

AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840

GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAAGAT 3960 

GAATTTGTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGTCACT 4020 

CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTAGAAG CTACACAGGA TGATTATGTA CTTGAAGTGA GGCACTTTCA GTGTCCTAAA 4140

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGTGACGGCA 4260 

GGAACTTTCT GTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAG 4380

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAGGCA GGAAGAGAAT 4440

CCATCCACCT CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGAGC 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620
CCTGACAGTA ACTTTCATGA CATAGGATTC TGCCGCCAAA TTTATATCAT Tf " " " 

CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TTGATTTTAA 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTGCAGT ATTCACCTAA 4980 

AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340
A AAAAAAAAAA AAAAAAA 

Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTHQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDHPT DNPELDLFPE 420

LIGTEEIIKE EEEGKDIEEG AIVWPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNAEASNS SHESRIGLAE GLSSEKKAVI 780

PLVIVSALTF ICLVVLVGIL IYWRKCFQTA HFYLEDSTSP RVISTPPTPI FPISDDVGAI 840

PIKHFPKHVA DLHASSGFTE EFETLKEFYQ EVQSCTVDLG ITADSSKHPD NKHKNRYINI 900

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960 

VEVIVMITNL VEKGRRKCDQ YWPADGSEEY GNFLVTQKSV QVLAYYTVRN FTLRKTKIKK 1020 

GSQKGRPSGR VVTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV VVHCSAGVGR 1080

TGTYIVLDSM LQQIQHEGTV NIFGFLKHIR SQRNYLVQTE EQYVFIKDTL VEAILSKETE 1140

VLDSHIHAYV NALLIPGPAG KTKLEKQFQL LSQSNIQQSD YSAALKQCNR EKNRTSSIIP 1200 

VERSRVGISS LSGEGTDYIN ASYIMGYYQS NEFIITQHPL LHTIKDFWRM IWDHNAQLVV 1260
MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNEEKLIIQD FILEATQDDY
VLEVRHFQCP KWPNPDSPIS KTFELISVIK EEAANRDGPM IVHDSHGGVT AGTFCALTTL 

HHQLEKENSV DVYQVAKMIN LMRPGVFADI EQYQFLYKVI LSLVSTRQEE H 

AALPDGNIAE SLESLV
3 CACGCACGAT



CAGCTCCTCT 



CCGCAGACCG 
GTGTTTGCCG 
AGATTGGCTG 



CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 
CTCCCCCTCC CTCTCCACTC TGAGAAG 
TCTGGAAATG CGAATCCTAA AGCGTTTCCT C 
CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 



CAAGTAAATG T 



CCCAAAACAA 



TTTCAGGGTT 



GTCAGCGGAG 
AAATGCAATA 
GAGATGCAAA 
GGAAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 



TGTCATCTGA 



AATGGTGTTT AAAGCAAGCA 
TGGATCAGAG CATAGTTTAG 
TGATGCAGAC CGATTTTCAA 



TGGAGTCGAA AGTGTTAGTC 
GAACCTTCTG CCAAACTCAA 
TCCCTGCACA GACACAGTTG 
CCAGTTGGCT GTTTTTTGTG 
GGACTACTTA CAAAACAATT 



ATCATTGGAA 
ATCTCACTAA TGACTACCGT 
AGATAACTTT TCACTGGGGA 
AAGGACAAAA ATTTCCACTT 
GTTTTGAGGA AGCAGTCAAA 
TTGGGACAGA AGAAAATTTG 
GTTTTGGGAA GCAGGCTGCT 
CTGACAAGTA TTACATTTAC 



CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG



GAAGAGGGAA 
AACCAAATCA 
ACGAAATACA 
AAGGGTGATG 



CTCGAGTCGT 1 
GAGAGGACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 
ATGAAGCCAA 
TTCCCAATAC 



GAAGGTACTT CAGCCTCTTT 



AGTTTATTGA 



AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TCCAGACAAC 
GTATACAATG 
GAATCCGAGA 
CTAGTGGTTC 



CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 



AACCAAGCAT 
GCTACCCAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAGGCGCT 
ACCCCAGATT 
GACTAACCGA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 



CCAGAGAATT 
ATGATTGAGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 



TCTTGTTACA 
AGTTTGCAGT TTTGTACCAG 
CAGATGGCTA TCAAGACTTG 
TTCTTCAGAT AGTAGCCATA 
TTGTCGACAT GCCTACTGAT 
AAGAAATAAT C 



TCTACCACAA 



TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 



ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 



AACATATCCC 



CACACTACAA TCGCATAGGG 
GAGGAAGTGA ATTCTCTGGA 
AACCAGTCAC TAAATTAGCC 
AACTGCCACC TCACACTGTG 
TTCTTAGATC TCCACATATG 
TAACAGAATA TGAGGAGGAG 
ATTCTTCAGG CTCCAGTCCC 



TCACTAAAGG 



CTTCCCAACT GAGGTAACAC 



ATTTCAGATG
CATGCAAGTA 
GGTATTACAG 
ATCGTTGCCT 
CTGACTGATT 
GCTGCCCAAG 
AATGTGGAAG 
CAGTACTGGC 



TTGTGGGTAT 
ACAGTACATC 
ATGTCGGAGC 



CAGACAGCTC 
ATGATCATAG 
ATATCAATGC 
GCCCACTGAA 
TTATTGTCAT 



CAACCACCCA 
CAGGGTTAAG 
CAATTATGTT 
ATCCACAGCT 
GATAACAAAC 



GACAACAAGC 
CTAGCACAGC 
GATGGCTACA 
GAAGATTTCT 



TGAGGGGTTG 
TTTTATCTGT 
TGCACACTTT 
2 CTCCAACACC TATCTTTCCA 
Z CAAAGCATGT TGCAGATTTA 
AGAGCTGTAC
ACAAGAATCG 
TTGCTGAAAA 



AAGGGCTCCC 
TGGCCTGACA 
GCCTATGCCA 
AGAACAGGCA 
GTCAACATAT 



TTGCCTATTA 
AGAAAGGAAG 
TGGGAGTACC 
AGCGCCATGC 
CATATATTGT 



GAGGTGCTGG 
GGCAAAACAA 
GACTATTCTG 
CCTGTGGAAA 
AATGCCTCCT 



GTTATGATTC 
GATGAGCCTA 
CTATCTAATG 
TATGTACTTG 
AGTAAAACTT 



GATCAAGGGT 
ATATCATGGG 
CCATCAAGGA 
CTGATGGCCA 
TAAATTGTGA 



GCTAGACAGT 
AAAACACATC 
TCATGATACA 
TCATGCCTAT 
ACAATTCCAG 
GCAATGCAAC 



CTGCCAGTGC 
GTTGTCGTCC 
ATGTTGCAGC 
CGTTCACAAA 
CTGGTTGAGG 
GTTAATGCAC 
CTCCTGAGCC 



AGATTCAACA 
GAAATTATTT 
CCATACTTAG 
TCCTCATTCC 



TCAGAAGAGT 3120 

3240 
3300 
3360 



CTACACGCAG 
GAGAAAGGCA 
TGGAGTTGGA
CGAAGGAACT 
GGTACAAACT 
TAAAGAAACT 
TGGACCAGCA 
ACAGCAGAGT 



CTTATGCACC 
AATCTGATGA 
ATCCTCAGCC 
GGTGCAGCAT 
AAGGGGTGGG 



CTATTACCAG 
TTTCTGGAGG 
AAACATGGCA 
GAGCTTTAAG 
TATAATTCAG 
CTTTCAGTGT 
AAGTGTTATA 
TGGAGGAGTG 



TCCCTGAGTG GAGAAGGCAC 
AGCAATGAAT TCATCATTAC 
ATGATATGGG ACCATAATGC 
GAAGATGAAT TTGTTTACTG 
GTCACTCTTA TGGCTGAAGA ACACAAATGT 
TAGAAGCTAC ACAGGATGAT 
CAAATCCAGA TAGCCCCATT 
AAAGAAGAAG CTGCCAATAG GGATGGGCCT 
TCTGACAACC 
CAAGATGATC 



TGCCTGATGG 



GGATTCTGCC 
TACTTATTAT 
GGTATTTTTT 



TTCTGTTATC 



CTTTGCTGAC ATTGAGCAGT 
AAGGCAGGAA GAGAATCCAT CCACCTCTCT 
AAATATAGCT GAGAGCTTAG AGTCTTTAGT 
TCTGAGCATT GTTTTCCTCT TCCTAAAATT 
TGTTGATTTC CCATCACCTG ACAGTAACTT 
TATCATTAAC AATGTGTGCC TTTTTGCAAG
AAATGATTGA ATTTTACAGT ATTTCTAAGA 
TTTTAACAGA AAATTTCAAT TTATAGAGGT 



3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 



TCATGACATA 4620 



AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860

A GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920

C TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040

ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTGAAAG 5220

AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280 

^ TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340
TDTVDWIVFK
TGKEEIHEAV 
HEPLTDGYQD 
LIGTEEIIKE 
RSPTRGSEFS 
GSKTVLRSPH 
ENISQGYIPS 
TAQPDVGSGR 
TEVTPHAFTP 
LVIVSALTPI 
IKHFPKHVAD 
KLAQLAEKDG 



LDPKAIIDGV 
DTVSISESQL 
CSSEPENVQA 
LGAILNNLLP 
EEEGKDIEEG 
GKGDVPNTSL 
MNLSGTAESL 



WANGYYRQQR 
KFQGWDKTSL 
EHSJjEGQKFP 
ESVSRFGKOA 
AVFCEVLTMQ 
DPENYTSLLV 
NMSYVLQIVA 
AIVNPGRDSA 
NSTSQPVTKL 



ENTFIHNTGK 
ALDPFILLNL 



TWERPRWYD 



ATEKDISLTS 



LPNSTDKYYI 
LQNNFREQQY 
TMIEKFAVLY 
SDQLIVDMPT 
ISTTTHYNRI 
QTVTSLPPHT 



SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSHEG 



GRWTQYHYT 
SMLQQIQHEG 
YVNALLIPGP 
SSLSGEGTDY 



ESFLQTNYTE 
SSRQQDLVST 
CLWLVGILI 
LHASSGFTEE 
KLTDYINANY 
DQYWPADGSE 
QWPDMGVPEY 
TVNIFGFLKH 
AGKTKLEKQF 
INASYIMGYY 



VDGYNRPKAY 
EYGNFLVTQK 
SLPVLTFVRK 
IRSQRNYLVQ 
OLLSQSNIQQ 



PVYNEASNSS 
FYLEDSTSPR 
LGITADSSNH 
IAAQGPLKST 
SVOVLAYYTV 
AAYAKRHAVG 
TEEQYVFIHD 
SDYSAALKQC 
PLLHTIKDFW 



QGPSVTDLEM 
HESRIGLAEG 
VISTPPTPIF 



YNGSLTSPPC 
KFSRQVFSSY 
QQLDGEDQTK 
DNPELDLFPE 
GTKYNEAKTN 
VEGTSASLND 
PATSAIPFIS 
NVWFPSSTDI 



LESEKKAVIP 
PISDDVGAIP 
NIVAYDHSRV 
HNVEVIVMIT 
KKGSQKGRPS 
GRTGTYIVLD 



TFELISV 

SVDVYQVAKM INLMRPGVFA DIEQYQFLYK V 



QDFILEATQD DYVLEVRHFQ 
VTAGTFCALT TLMHQLEKEN 
EENPSTSLDS NGAALPDGNI 






CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 
CGG CGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 
1 GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 



GTCATCTGAT 



ATGGCTCATT 



' CTGGTTATGT 
TCTCTAGACA 
GTTCAGAACC 
GGGAAAGACC 



GTTAAGAGCT 
GATTATTGAT 
CATACTGTTG 
GACATCTCCT 
CTCTGAAAGC 



AACCTTCTGC 



GTGCTATTCT 
GCACTAATGG 
ATCCTGAACT 
AAGAGGGAAA 
ACCAAATCAG 



AGAAAATGTT 
TCGAGTCGTT 
AGAGGACCAA
CAATAATTTG 
CTTATATGGA 
TGATCTTTTC 
AGACATTGAA 
GAAAAAGGAA 



GACTACTTAC 
TCATACACTG 
CAGGCTGACC 
TATGATACCA 



TGTTTGAGGT 
GTGTTAGTCG 
CAAACTCAAC 
ACACAGTTGA 
TTTTTTGTGA 
AAAACAATTT 



TGGGACAGAA 
TTTTGGGAAG 
TGACAAGTAT 
CTGGATTGTT 



TCGAGAGCAA 
GATTCATGAA 
TACCAGCCTT 
GTTTGCAGTT
AGATGGCTAT 



CTACCCAATA TGAGTTATGT TCTTCAGATA 
AAATACAGCG 
CCTGAATTAA 
GAAGGCGCTA 



\ TATTTCCTTG 

GACTGCAGAA 
CAGTTTCAAG 
TATCCCATTC 
GACAATAACA 
TTCATCAGGT 
TAGCTCTACA 
GACTAATTAC 
AGGCCCAGTG 



ACTTCTCAGA 
AATGATGGCT 
TCCTTAAATA 
CTTGATACTG 



CTACCACAAC 
CCCCAACAAG 
CCACTTCCCA 



CTAAAACTGT 
CAGTTTCTAT 
GAGCTGAAGA 



AGAAATAATC 
TGGTAGAGAC 
ACACTACAAT 
AGGAAGTGAA 
ACCAGTCACT 
ACTGCCACCT 
TCTTAGATCT 
AACAGAATAT 
TTCTTCAGGC 



GAAAATTTGG 
CAGGCTGCTT 
TACATTTACA 
TTTAAAGATA 
ATGCAACAAT 
CAGTACAAGT 
GCAGTTTGTA 
CTTGTTACAT 
TTGTACCAGC 
CAAGACTTGG
GTAGCCATAT 
CCTACTGATA 
AAGGAGGAGG 
AGTGCTACAA 



AAATTAGCCA 
CACACTGTGG 
CCACATATGA 



2100 
2280 



ATTATTCTAC CTTTGCCTAC
CCAGACAACA GGATTTGGTC 
TATACAATGA GGCCAGTAAT 
AATCCGAGAA GAAGGCAGTT 
TAGTGGTTCT TGTGGGTATT 



ATACCCCTTG 
CTCATCTACT 
CCTAGAGTTA 



TCATGCTTTT ACCCCATCCT 
CTCGCAGACA ACCCAACCGG 
TGGTCTAGCT GAGGGGTTGG
r GAAGAATTTG



AGCACTTTCC 
CAGACAGCTC 



CTGACTGATT 



AAGGAAGGAG 



ATGGGAACAT 



GCCCACTGAA 



TCAGAAGAGT 



CAGTACTGGC 
GTGCAAGTGC 
AAGGGCTCCC 



CTTCCAGACT 
TCCAACACCT
AAAGCATGTT 
AGAGTTTTAC 
CAACCACCCA 
CAGGGTTAAG 
CAATTATGTT 
ATCCACAGCT 
GATAACAAAC 



A TACTGTGAGG



AGATTCAACA C 



CCATACTTAG 
TCCTCATTCC 
AGTCAAATAT 
ATCGAACTTC 
GAGAAGGCAC 
-TCATCATTAC 
ACCATAATGC 
TTGTTTACTG 
TGGCTGAAGA 
TAGAAGCTAC 
CAAATCCAGA
CTGCCAATAG 

ACCAGGTAGC 
ATCAGTTTCT 



GCCAAATAAA 
ACACAAATGT 
ACAGGATGAT 
TAGCCCCATT 
GGATGGGCCT 



CAAGATGATC 
CTACAAAGTG 
GGACAGTAAT 
TTAACACAGA 



GCCTATGCCA 
AGAACAGGCA 
GTCAACATAT 
GAGGAGCAAT 
GAGGTGCTGG 
GGCAAAACAA 
GACTATTCTG 
CCTGTGGAAA 
AATGCCTCCT 
CTCCTTCATA 
GTTATGATTC 
GATGAGCCTA 
CTATCTAATG 
TATGTACTTG 
AGTAAAACTT 
ATGATTGTTC 
CTTATGCACC 
AATCTGATGA 
ATCCTCAGCC 



TGGGAGTACC 
AGCGCCATGC 
CATATATTGT 



ATGTCTTCA? 
AGCTAGAGAA 



AGAGTACTCC 
AGTGGGGCCT 
GCTAGACAGT 
r AAAACACATC C 



ATATCATGGG 
CCATCAAGGA 
CTGATGGCCA 
TAAATTGTGA 



AAGTGAGGCA 
TTGAACTTAT 
ATGATGAGCA 



GCAATGCAAC 
TGGCATTTCA 
CTATTACCAG 
TTTCTGGAGG 
AAACATGGCA 
GAGCTTTAAG 
TATAATTCAG 
CTTTCAGTGT 
AAGTGTTATA 



AAGGGGTGGG 



TTTTTGCAAG 
ATTTCTAAGA 
TTATAGAGGT 
CTGTATTTGT 
AATAAAACAC 
GAAATAATCT 



TCATGACATA 



TGTGTTACCT 
AAATACCTTC 
ATGGTTTTTA 
AAAAAAAAAA 



ATGGAATTGT 
TAGGAATTCC 
AGCAATTATC 
TCTTCCATAT 
GTTACTTATT 
AGATTTTTAT 
TGACGTAGTT 
AAGTCATTAA 
ATTTTGAAAG 
TCCAAGGAAT 



GGATTCTGCC 
TACTTATTAT 
GGTATTTTTT 
AAACTACAGA 



TTCTGTTATC 



GTAAATACTG 
ATTTTACTAC 
CATTAGCTGG 
CTTTGTTTCA 



TGCAAAAATA 



TCTGTATTGA 
AAATGTTTGT 
GAAATATAAC 
ATTTTACAAC 
CCCTAGTGTC 
TGAGTCAAGT 
TCTTACTCTA 
GCATGTAATT 
GAGAATAACA 



AGAAAATTCC 

CTTTGCTGAC 
AAGGCAGGAA 
AAATATAGCT 
TCTGAGCATT 
TGTTGATTTC 
TATCATTAAC 
AAATGATTGA 
TTTTAACAGA
TTTTAGTGTC 
TTTTAATACA 
TGCAGTATTC 
TCCATGGACC 
TTTCTAGTTC 
CCAGTTTTCT 
TTAACTTTTG 
CCTTACCAAA 
TTGCCATTAA 



2400 
2450 

2640 



CAGGAAGTGC 



GATGGCTACA 



TACGGGAACT 



CGTGTGGTCA 
CTGCCAGTGC 
GTTGTCGTCC 



TCCCTGAGTG 



ATGATATGGG
GAAGATGAAT 
GTCACTCTTA 
GACTTTATCT 
CCTAAATGGC 
AAAGAAGAAG 
ACGGCAGGAA 
GTGGATGTTT 
ATTGAGCAGT 
GAGAATCCAT 



GTTTTCCTCT 
CCATCACCTG 
AATGTGTGCC 
ATTTTACAGT 
AAATTTCAAT 
AAATTTTTAG 
GTAGCCTGTA



AAATTTATAT 
TGTGTAATTG 
GACATTGTAT 
TGGAAAATAG 
CATTGTTCAA 



I 

MVFKASKITF 
LSILFEVGTE ENLDFKAI ID 
PCTDTVDWIV FKDTVSISES 
SYTGKBEIHE AVCSSEPBNV 
TKHEFLTDGY QDLGAI bNNL 
PELIGTEEII KEEEEGKDIE 
TNRSPTRGSE FSGKGDVPNT 
NDGSKTVLRS PHMNLSGTAE 
ISENISQGYI 



QLAVFCEVLT 
QADPENYTSL 
LPNMSYVLQI 
EGAIVNPGRD 
SLNSTSQPVT 



FPLEMQIYCF DADRFSSFEE 
QAALDPFILL NLLPNSTDKY 
MQQSGYVMLM DYLQNNFREQ 



AVKGKGKLRA 



VAICTNGLYG KYSDQLIVDM 
SATNQIRKKE PQISTTTHYN 
TSQTVTELPP 



K LDTGAEDSSG 



PTDNPELDLF 300 

RIGTKYNEAK 3S0 

HTVEGTEASL 420 

SSPATSAIPF 480 



YDVLIPESAR NASEDSTSSG SEESLKDPSM EGNVWFPSST 



TPSSRQQDLV 
FICLWLVGI 
ADLHASSGFT 
LAQLAEKDGK 
LVEKGRRKCD 



STVNWYSQT T 
LIYWRKCFQT A 
EEFETLKEFY Q 



EVLDSHIHAY 
PVERSRVGIS 
VMIPDGQNMA 
YVLEVRHFQC 
LMHQLEKENS 



SLSGEGTDYI 
EDEFVYWPNK' 
PKWPNPDSPI 
VDVYQVAKMI 
ESLESLV 



TKSFSAGPV 
TQPVYNEASN 
AHFYLEDSTS 
QEVQSCTVDL 
DGYNRPKAYI 
YGNFLVTQKS 
LPVLTFVRKA 
RSQRNYLVQT 
LIjSQSNIQQS 

VTLMAEEHKC 
KEEAANRDGP 
NLMRPGVFAD IEQYQFLYKV 



GITADSSNHP 
AAQGPLKSTA 
VQVLAYYTVR 
AYAKRHAVGP 
.EEQYVFIHDT 
DYSAALKQCN 
LLHTIKDFWR 
IQ 



EMPHYSTFAY 600 



ILSLVSTRQE 



NFTLRNTKIK 
WVHCSAGVG 
LVEAILSKET 
REKNRTSSII 
MIKDHNAQLV 
DFILEATQDD 
TAGTFCALTT



I I

CACACATACG CACGCACGAT 
CAAAAAAAAC ATTTCCTTCG 
CGGCGAGGGG CCGCAGACCG 
CAGCTCCTCT GTGTTTGCCG 
CTTGTTGAAG AGATTGGCTG 
AAATATCCAA CATGTAATAG 
3 TGAATCTTAA
CTCACTTCGA T



I GGAGGATTAA AACAAACAAA 



GTCAGCGGAG GAGTTTCAGA 
AAATGCAATA TGTCATCTGA 
GAGATGCAAA TCTACTGCTT 
GGAAAAGGGA AGTTAAGAGC 
GATTTCAAAG CGATTATTGA 
T TCATACTGTT 



CCTGGATTGG 
CCCAAAACAA 
TGGGAAAACA 
TGGATCAGAG 



GCTAATGGAT 
GGAGCACTGA 
TCTCCTATCA 



ACTACAGACA 



ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 

TGACTACCGT 



TGGAGTCGAA 
GAACCTTCTG 
TCCCTGCACA 
CCAGTTGGCT 
GGACTACTTA 



CGATTTTCAA 
AGTGTTAGTC 



AAGGACAAAA 
TTGGGACAGA 
CTGACAAGTA 



GTTTTTTGTG 



TTCGAGAGCA 



AATGCAACAA 



TGGGAAAGAC 
CAGTTGGATG 
GGTGCTATTC 
TGCACTAATG 
AATCCTGAAC 



AACCAAATCA 



ACAGAAAAAG 
GAAGGTACTT 
AACTTGTCGG 
AGTTTATTGA 



CAGAAAATGT 
CTCGAGTCGT 
GAGAGGACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAAGGA 
ATGAAGCCAA 
TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
GGACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 



TCAGGCTGAC 
TTATGATACC 
AACCAAGCAT 
GCTACCCAAT 
AAAATACAGC 

AGAAGGCGCT 
ACCCCAGATT 
GACTAACCGA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 



A AGTTTGCAGT TTTGTACCAG 



ATGAGTTATG 



ATTGGAACTG 
ATTGTGAATC 
TCTACCACAA 
TCCCCAACAA 
TCCACTTGCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCC 



TCACTAAAGG 



AGAAGGCAGT 
ACAGTACATC 



CACAAGAATC 



CTGTTGACTT 



TAGTAGCCAT 
TATACCCCTT 
TCTCATCTAC 
CCCTAGAGTT 
AATTCCAATA 
TGAAGAATTT 



AGGATGGCAA 
AAGCTTATAT 
TATGGGAACA 
GAAAATGTGA 



TATCGTTGCC 



TGGAGGAAAT 
ATATCCACAC 
AAGCACTTTC 
GAGACACTGA 
GCAGACAGCT 



TTGTCGACAT 
AAGAAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAGTGA 
AACCAGTCAC 
AACTGCCACC 
TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTGCTAG 
ATCCTTCTAT 
ATGTTGGATC 
AATCTGAGAA 
TTACAGATCT 
CTCATGCTTT 
ACTCGCAGAC 
TTGGTCTAGC 
CAGCCCTGAC 
GCTTCCAGAC 
CTCCAACACC 
CAAAGCATGT 



CTCCAGTCCC 
ATTTTCCTCC 
AAATGCTTCC 
GGAGGGAAAT 



GACAACCAAG 
GGAAATGCCA 
TACCCCATCC 
AACCCAACCG 



TTTTATCTGT 
TGCACACTTT 
TATCTTTCCA 
TGCAGATTTA 
CCAGGAAGTG 



2160 
2220 
2280 
2310 
2400 
2460 



CTAAGAAACA
ACACAGTATC 
CTGACCTTTG 
CACTGCAGTG 
CAGATTCAAC 
AGAAATTATT 
GCCATACTTA 
CTCCTCATTC 
CTGTCACCCA 
GGCTTAACTG 
TCAAATATAC 
CGAACTTCTT 



CAAAAATAAA 



TGAGAAAGGC 
CTGGAGTTGG 
ACGAAGGAAC 
TGGTACAAAC 
GTAAAGAAAC 
CTGGACCAGC 



TGCTGCCCAA 
TAATGTGGAA 
TCAGTACTGG 
TGTGCAAGTG
AAAGGGCTCC 
GTGGCCTGAC 
AGCCTATGCC 
AAGAACAGGC 
TGTCAACATA 
TGAGGAGCAA 



TATATCAATG 



AATCCACAGC 
TGATAACAAA 
GGAGTGAGGA 
ATACTGTGAG 



ATCATTACCC 
CATAATGCCC 
GTTTACTGGC 



ATCCTCCTAC 
AGCAGAGTGA 
CTATCATCCC 
ACTACATCAA
AGCACCCTCT 
AACTGGTGGT 



AGGCAAAACA 
CAGAGGCACA 
CTCAGCCTCC 
CTATTCTGCA 
TGTGGAAAGA 
TGCCTCCTAT 
CCTTCATACC 
TATGATTCCT 



GACAGTCATA 
AAGCTAGAGA 
ATCTCGGCTC 



GAAGCTACAC 
GCCAATAGGG 



ACAAATGTCT 
AGGATGATTA
GCCCCATTAG 
ATGGGCCTAT 
TGACAACCCT 



ATCTAATGAG 
TGTACTTGAA 
TAAAACTTTT 
GATTGTTCAT 
TATGCACCAA 



ATCAAGGATT 
GATGGCCAAA 
AATTGTGAGA 
GAAAAACTTA 
GTGAGGCACT 



TGCTAGACAG 
TAAAACACAT 
TTCATGATAC
TTCATGCCTA 
AACAATTCCA 
ACTGCAACCT 
GGACTATACT 
AATGCAACAG 
GCATTTCATC
ATTACCAGAG 
TCTGGAGGAT 
ACATGGCAGA 



TGTTAATGCA 
GGGTCTCACT
TCCTCTCCCT 



CAGGTAGCCA 
CAGTTTCTCT 
ACCTCTCTGG ACAGTAATGG TGCAGCATTG 
TCTTTAGTTT 



CTAGAAAAAG 
CCAGGAGTCT 
GTGGGCACAA 



TAATTCAGGA 
TTCAGTGTCC 
GTGTTATAA^i 
GAGGAGTGAC 
AAAATTCCGT 



CCGTTCACAA 3480 



3660 
372 0 

3840 
3900 
3960 
4020 
4080 

4200 
4260 
4320 
4380 

4500 
4560 



AGATGAATTT 
CACTCTTATG 
CTTTATCTTA



AGAAGAAGCT 
GGCAGGAACT 
GGATGTTTAC 



AGTAACTTTC 



TTCTAAGAAT GGAATTGTGG TATTTT1 
ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT 
GTATTTGTAG CAATTATCAG GTTTGCTAGA 
TAAAACACTC TTCCATATGA TATTCAACAT 
AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC 



GGCAGGAAGA GAATCCATCC
ATATAGCTGA GAGCTTAGAG 
TGAGCATTGT TTTCCTCTTC 
TTGATTTCCC ATCACCTGAC 
TCATTAACAA TGTGTGCCTT 
TTTACAGTAT 



TTAATACAGT AGCCTGTAAA 
CAGTATTCAC CTAAAGTAGA 
CATGGACCAA ATTTATATTT
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 5220

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340

ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 5400 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 581 Protein sequence:
Protein Accession #: EOS sequence 

1 11 21 31 41 51

| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

PKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCPDA DRFSSPEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420

LIGTEEIIKE EEEGKDIEEG AIVHPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSPKLD TGAEDSSGSS PATSAIPFIS 600

ENISQGYIFS SENPETITYD VLIPESARKA SEDSTSSGSE ESLKDPSMEG NVNFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840

IKHFPKHVAD LHASSGFTEE FETLKEPYQE VQSCTVDLGI TADSSNHPDN KHXMRYINIV 900 

AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIWEHNV 960 

EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ VLAYYTVRNF TLRNTKIKKG 1020 

SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY AKRHAVGPVV VHCSAGVGRT 1080

GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE QYVFIHDTLV EAILSKETEV 1140 

LDSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA 1200 
SRVAGTILLS QSNIQQSDYS AALKQCNREK NRTSSIIPVE RSRVGIS 
YIMGY 'QGNE FIITQHPLLH TIKDFWRMIW DHNAQLWMI PDGQMMAEDE F 
INCESFKVTL MAEEHKCLSN EEKLIIQDFI LEATQDDYVL EVRHFQCPKW P 
PELISVIKEE AANRDGPMIV HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINLM 
RPGVFADIEQ YQFLYKVILS LVGTRQEENP STSLDSNGAA LPDGMIAESL ESLV 



1 11 21 31 41 51

| | | | | |

CACACATACG CAGGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATCCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520
TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880 

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGACAGA 4440

GATAGTGATG GCTTATCCAT TCATAAGTGT ATGTCATGCT CATCCTATAG AGAATCACAG 4500

GAAAAGGTAA TGAATGATTC AGACACCCAC GAAAACAGTC TTATGGATCA GAATAATCCA 4560 

ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAAG TGTATCCTCA 4620 

GACAGTCAAA CTGGTATGGA CAGAAGTCCT GGTAAATCAC CATCAGCAAA TGGGCTATCC 4680 

CAAAAGCACA ATGATGGAAA AGAGGAAAAT GACATTCAGA CTGGTAGTGC TCTGCTTCCT 4740 

CTCAGCCCTG AATCTAAAGC ATGGGCAGTT CTGACAAGTG ATGAAGAAAG TGGATCAGGG 4800 

CAAGGTACCT CAGATAGCCT TAATGAGAAT GAGACTTCCA CAGATTTCAG TTTTGCAGAC 4860 

ACTAATGAAA AAGATGCTGA TGGGATCCTG GCAGCAGGTG ACTCAGAAAT AACTCCTGGA 4920 

TTCCCACAGT CCCCAACATC ATCTGTTACT AGCGAGAACT CAGAAGTGTT CCACGTTTCA 4980 

GAGGCAGAGG CCAGTAATAG TAGCCATGAG TCTCGTATTG GTCTAGCTGA GGGGTTGGAA 5040 

TCCGAGAAGA AGGCAGTTAT ACCCCTTGTG ATCGTGTCAG CCCTGACTTT TATCTGTCTA 5100

GTGGTTCTTG TGGGTATTCT CATCTACTGG AGGAAATGCT TCCAGACTGC ACACTTTTAC 5160

TTAGAGGACA GTACATCCCC TAGAGTTATA TCCACACCTC CAACACCTAT CTTTCCAATT 5220

TCAGATGATG TCGGAGCAAT TCCAATAAAG CACTTTCCAA AGCATGTTGC AGATTTACAT 5280

GCAAGTAGTG GGTTTACTGA AGAATTTGAG ACACTGAAAG AGTTTTACCA GGAAGTGCAG 5340 

AGCTGTACTG TTGACTTAGG TATTACAGCA GACAGCTCCA ACCACCCAGA CAACAAGCAC 5400 

AAGAATCGAT ACATAAATAT CGTTGCCTAT GATCATAGCA GGGTTAAGCT AGCACAGCTT 5460 

GCTGAAAAGG ATGGCAAACT GACTGATTAT ATCAATGCCA ATTATGTTGA TGGCTACAAC 5520

AGACCAAAAG CTTATATTGC TGCCCAAGGC CCACTGAAAT CCACAGCTGA AGATTTCTGG 5580 

AGAATGATAT GGGAACATAA TGTGGAAGTT ATTGTCATGA TAACAAACCT CGTGGAGAAA 5640 

GGAAGGAGAA AATGTGATCA GTACTGGCCT GCCGATGGGA GTGAGGAGTA CGGGAACTTT 5700 

CTGGTCACTC AGAAGAGTGT GCAAGTGCTT GCCTATTATA CTGTGAGGAA TTTTACTCTA 5760 

AGAAACACAA AAATAAAAAA GGGCTCCCAG AAAGGAAGAC CCAGTGGACG TGTGGTCACA 5820

CAGTATCACT ACACGCAGTG GCCTGACATG GGAGTACCAG AGTACTCCCT GCCAGTGCTG 5880 

ACCTTTGTGA GAAAGGCAGC CTATGCCAAG CGCCATGCAG TGGGGCCTGT TGTCGTCCAC 5940 

TGCAGTGCTG GAGTTGGAAG AACAGGCACA TATATTGTGC TAGACAGTAT GTTGCAGCAG 6000 

ATTCAACACG AAGGAACTGT CAACATATTT GGCTTCTTAA AACACATCCG TTCACAAAGA 6060 

AATTATTTGG TACAAACTGA GGAGCAATAT GTCTTCATTC ATGATACACT GGTTGAGGCC 6120 

ATACTTAGTA AAGAAACTGA GGTGCTGGAC AGTCATATTC ATGCCTATGT TAATGCACTC 6180 

CTCATTCCTG GACCAGCAGG CAAAACAAAG CTAGAGAAAC AATTCCAGCT CCTGAGCCAG 6240 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 6300 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 6360

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 6420 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 6480

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 6540

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 6600 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 6660 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 6720 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 6780 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 6840 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 6900 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 6960 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGAGCACAA GGCAGGAAGA GAATCCATCC 7020

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 7080

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 7140 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 7200 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 7260 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 7320

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 7380 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 7440 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 7500 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 7560 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 7620 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 7680 

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 7740
TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 7800

ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 7860 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 7920 
AAAAAAAAAA A 

Seq ID 



PCT/US02/12476 






MEILKRPLAC 
QSPINIDEDL 
FKASKITFHW 
ILPEVGTEEN LDPKAIIDGV 
TDTVDWIVFK DTVSISESQL 
TGKEEIHEAV CSSEPENVQA 
HEFLTDGYQD LGAILNNLLP 
EI IKE EEEGKDIEEG 



WANGYYRQOR 
KFQGWDKTSL 
EHSLEGQKFP 
ESVSRFGKQA 



I I 
KLVEEIGWSY TGALNQXNWG KKYPTCNSPK 
ENTFIHNTGK TVEINLTNDY R 
LEMQIYCFDA DRFSSFEEAV 
ALDPFILLNL LPNSTDKYYI 
LQNNFREQQY 



RSPTRGSEFS 
GSKTVLRSPH 
ENISQGYIFS 
TAQPDVGSGR 
TEVTPHAFTP 
LNTTPAASSS 
ILPQVTSATE 
KTLMFSQVEP 



NMSYVLQIVA 
NSTSQPVTKL 
VLIPESARNA 



TNQIRKKEPQ 
ESLLTSPKLD 



SDQLIVDMPT 



3 PVYNGETPLQ 



MNLSGTAESL 
SENPETITYD 
ESFLQTNYTE 
SSRQQDLVST 
DSALHATPVF 
SDKVPLHASL PVAGGDLLLE PSLAQYSDVL 
SSGPEPSYAL 
ASLLQPTHAL 
LSKSEIIYGN 



QTVTELPPHT VEGTSASLND 



QGPSVTDLEM 
PSYSSEVFPL 
PFSEASFSSE 
STTHAASETL 



PHYSTFAYFP 
VTPLLLDNQI 
LFRHLHTVSQ 



CMSCSSYRES 
PGKSPSANGL 
NETSTDFSFA 



ADSSNHPDNK 
GPLKSTAEDF 
LAYYTVRMFT 
KRHAVGPVW 



SISSTKGMFP 
ASSDPASSEM 
TPKVDKISST 
PVLLKSESSH 
IHSDEILTST 
KLLFPSKATS 



LSPSTQLLFY 
MLHLIVSNSA 
QWPSLYSND 
KSSVTGKVFA 
ELSHSAKSDA 



SGDGEWSGAS 
ETELQIPSFN 
FDHEISQVPE 
ETSASFSTEV 



SDSEFLLPDT 
EMVYPSSSTV 
NNFSVQPTHT 
LLQPSFQASD 
VPVFDVSPTS 
NQAHPPKGRH 
VSTDHSVPIG 
TDDDGDDDDD 



VSQASGDTSL 
VDTLLKTVLP 
HMHSASLQGL 



NGHVAITAVS 



SQKHNDGKEE NDIQTGSALL
DTNEKDADGI LAAGDSEITP 
ESEKKAVIPL VIVSALTFIC 
KHFPKHVADL 
TNIVA YDHSRVKLAQ 
VIVMITNLVE 
QKGRPSGRW 
TYIVLDSMLQ 



AALKQCNREK 
TIKDFWRMIW 
EEKLIIQDFI 
HDEHGGVTAG 
LVSTRQEENP 



DHNAQLVVMI PDGQNMAEDE
LEATQDDYVL EVRHFQCPKW 
TFCALTTLMH QLEKENSVDV 
STSLDSNGAA 



LAEKDGKLTD 
KGRRKCDQYW 
TQYHYTQWPD 
QIQHEGTVNI
LLIPGPAGKT 
GEGTDYINAS 
FVYWPNKDEP 
PNPDSPISKT 
YQVAKMINLM



SEAEASNSSH 
WRKCFQTAHF YLEDSTSPRV 
ETLKEFYQEV QSCTVDLGIT 
YINANYVDGY NRPKAYIAAQ
PADGSEEYGN FLVTQKSVQV 



FGFLKHIRSQ RNYLVQTEEQ 
KLEKQFQLLS QSNIQQSDYS 
YIMGYYQSNE FIITQHPLLH
INCESFKVTL fAEEHKCLSN 
FELISVIKEE AANRDGPMIV
RPGVFADIEQ YQFLYKVILS 



CCGGGCAGGT 
AGGGGCGCAG 
AGAAGATGAA 
GTGTGAGGGA 
GGAGAACTCG 



GAAAGTACCA 



CCCGTGTGGC 
ACGAGTCTTC 
AAGTTGGGCC 
TCATCCTGTC 

TGTTGTTAGT 
CTTGGGCATT 
TTAAGAAGAT 
TTTGCTCCAA 
GAGGACCCGT 
GCTTCCTGGG 
TCACAGCATA 
ATGAAGTTCT 
AGAGTGTTCA 



GGCTCATGCT 
GAATTCTGAT 
GGATATCGAC 
GAGAACCAGC 
ACCGTTGGAA 
TGCCTCCATG 
TCATGGCTTG 
CAATGCTGGG 
CCACAAGAAG 
TGACGTGAAC 
AGACGCTGCT 
CATCGTGTGC 
ACACCTCTTG 
GCTGGGCCTC 
GAATTACCGA 



CGGGAGCGTG 
ATAGGAAAAG 



CAGT CTGTGA 



1GCGGTT GTCCTGGAGC 



CATTCTCAGC 
AGTGCTCTGA 
CTTTTTTCCT 
GGGGAGCTCT 
TGCAGAAGAC 
TCCCTGCGAA 
CTGATGATCA 
GAGTATACCC 
CTCCTGACGG 
ACCGGTGTCC 
AAGAACATTA 



CGCACAGAGA 
CCTTGGAAAC 
TCAGAATCCT 
AGCCCATCCG 
GTATGACTTT 
CAATGGAAGA 
TAGAGAGACT 



CCCCAGTCCT GGGTATAGAA 



AGCAGCCCGA 
GGATGAGGAG 
GACTACTTCC 
TTCGTGGCTT 



CGCAGCTGGC 



AAATCGTGCG 
GCTTGCGGGG 
AAGAGAAATC 



GTGGCAAGAA 
GATCTTCTGC 
TGGCTTCAGT 
GTCTAACCTG 
GTCTTGGTCG 
GGCCATCCTA 



TGTTGCCATC 
ATCAGCTGTT 
TTT CAGGAG A 



TTTATCCTCT 



AAAAATCCGC 



GACCCTGGGC 
CATGACTTTT 
GGCTGTTGAC 
ACCAGCCAGT 
CEACTCCAGT 
TTCCAGGGGC 
GGCAGAGCAG 
AGAAGGCAAG 



AAATTTATCA 
GAGGAGGAGC 
GCTCCCATTG 
TTCGATCTGA 
GCTTTGAAAG 



TTTATAATGT 
TTTACCCAGC 
CCGCCACGGA 



AAAGGCCACC 



TAACACCGTT 

AGATAGAGAT 
CGCCCAAGCT 
AGGTGAGGCA 
TCCTCCTGGA 
TGGGCCACCT 



CATCCCAAGG 
AAACACCAGC 
TCTTCTCTGG 
CTGTCCAAGC 
GAG CTGAATG 
CGCACCAGGC 
GGACCAGCCT 
CAGTACAGCT 
CTTGCACTGA 
ACCATGGCAT 
CTCATCAACA 
CTGCTGGCTG 
GGACCAACAG 
GCATCACGGC 
CAGAAGATGA 
GCATTTTCTC 
GGGTACTTCC 
GTGACCTTCT 
GTGGTGACAG 
TCCCTCTCAG 
GTTCACATGA 
ACCTTGGCAT 
ATGAAAAAAG 
ACTGAGCATC 



GCGCTTACAG AGGACACTGC 



CGTTGGCAGC 
AATTATTCTG 
AATGATGTTT 
TGAACGTGTC 
CTGGGTCAAA 
GGAAAAAGCC 
TGCCAGCGTG 
GGCTTTCACA 
TTCAGTAAAG 
AATGGAAGAG 



GACCCCCAAA 
ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACCGCCAGT TGAGATCAAT TCAAAAAAGG 2520

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580



GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAACGAGA 2820

CCTCGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG AGGAGTTGTC TTTGTCAAGG 2940

GCACGCTGCG AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGACGTGCGG CTGCCGTTCC AGGCCGAGAT GTTCATCCAG AACGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420

CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATCGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 
GCATAGCTAG AGCCCTGCTC CGCCACTGTA A 
CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT C 
GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCCGAT A 
r 1 ( CCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 
GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 
TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 
CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TCGCACAGCA 
GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT T. 
ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT A 
GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA G 
TCTATATATA ATTCTGTACA TAGCCTf 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 
TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTMCA T 
CTCTAGCTGG TGGTTTCACG GTGCCAGGTT T 
ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC A 
GAGACGGGTG GGCGGCTGGA GACCATGCAG AGCGCCGTGA GTTCTCAGGG C 

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 

GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 

ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760

CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820



Seq ID NO: 585 Protein sequence 
Protein Accession #: NP_005679.1

i i 1 T f T T 

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLE 
LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 
VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 
LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW 
ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFEA AAVGSLLAGG 
PWAILGMIY NVIILGPTGF LGSAVFILFY PAMMFA3RLT AYFRRKCVAA TDSRVQKMNE 
VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAE ,r ^ " 
HMTLGFDLTA AQAFTWTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 
NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKHKKDK RASRGKKEKV RQLQRTEHQA 
VLAEQKGHLL LDSDERPSPE EEEGKH1HLG HLRLQRTLHS IDLEIQEGKL VGICGSVGSG 
KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILN 1 L l"l r L j E 1 
E RGANLSGGQR QRISLARALY S 
K HLKSKTVLFV THQLQYLVDC DEVIFMKEGC 1 



FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 




I QAAGGPLAFL
H MQYYASIYAL 
R ILNRFSKDMD 
I VSRVLIRELK 
LDDNQAPFFL FTCAMRWLAV 
GLFQFTVRLA SETEARFTSV 
RYRENLPLVL KKVSFTIKPK 
IGLADLRSKL SIIPQEPVLF 
SEVMENGDNF SVGERQLLCI 
MLTIAHRLHT VLGSDRIMVL 



VIMALFMLNV GSTAFSTWWL 
SMAVMLILKA IRGVVFVKGT
EVDVRLPFQA EMFIQNVILV 
RLDNITQSPF LSHITSSIQG 
TTTGLMIVLM 



SGKSSLGMAL 
FNQYTEDQIW 
jDEATAAM 



SYWIKQGSGN TTVTRGNETS 900
LFRRILRSPM 960
FPWFLVAVGP 
QEFLHRYQEL 
LAISYAVQLT 
GEVTFENAEM 



FFCVGMIAGV 



AGCAGGGGGC 
GACGGGCGAT 



AAGGGCCTCG 
GCTGAATGGA 
CCTCGCCATG 
GGATGCCCCA 
CATACTGACT 



GCTGTGTGTA 
GGCAGAGGCT 
GCTGATGGCC 
GGAGAGGCGG 



AQGQVVEFDT PSVLLSNDSS



C CGAG AATAC G 



HGQIPPAYAG 
KAPSPDWPQE 
FRLVELSGGC 
DALERTHMKE
DTETDLLIQE 
RFYAMFAAAE 



CIAQLPLKLE 
TIREAFADCT
HKVAVKG 



r GCGGGGCCAG G 



AT CCG ACTG A 
CAGCTTTCCC 
CCCTCAGGGC 
CTAGGGAATG 
GGAGGAGGAC 



CACCCATGGA 
TGCCAGGGGT 
CTGCTGCAGA 



AGAGGCGCTA 
GTCCCAGCAC 
GGCTTACATG 



7 Protein sequence 



3 CGGCAGAGGT CCCCGGGGCG CAGGGGCAGC 240



CCACCGCCAA CTGCAGCTCT CCATCAGCTC 
GATCACGCAG TGCTTTCTGC CCGTGTTTTT 
AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 



MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA 
PRGPHGGAAS GLNGCCRCGA RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPLPVPG 
VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQLSLLM WITQCFLPVF LAQPPSGQRR
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I 



3 GCACAGGGGG 
C CAGGGGGCAA 
3 GCGCAGGGGC 
3 GTGCGGCTTC 
3 TGCTTCAGTT 



GTGTTTTTGG 
TAGGTCATGC 
GCCTGATTGT 
CTGAGCTA 



CTCAGGCTCC 
CTCCTCCCCT 
TTGTCGCTGG 



TTCGACGGGC 
TGCTGGCGGC 
AGCAAGGGCC 
TGCGCAGGAT 
CCGACTGACT 
GCTTTCCCTG 
CTCAGGGCAG 
AGGGAATGGT 
AGGAGGACGG 



CCGGGCAGAG GCTCCGGAGC CATGCAGGCC 
GATGCTGATG GCCCAGGAGG CCCTGGCATT 
GCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 



GGAAGGTGCC CCTGCGGGGC CAGGAGGCCG 
GCTGCAGACC ACCGCCAACT GCAGCTCTCC 
TTGATGTGGA TCACGCAGTG CTTTCTGCCC 
AGGCGCTAAG CCCAGCCTGG CGCCCCTTCC 
CCCAGCACGA GTGGCCAGTT CATTGTGGGG 
CTTACATGTT TGTTTCTGTA GAAAATAAAG 



MQAEGQGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 
PRGPHGGAAS AQDGRCPCGA RRPDSRLLQF RLTAADHRQL QLSISSCLQQ LSLLMWITQC 
FLPVFLAQAP SGQRR 

Seq ID NO: 590 DNA sequence 
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671



ACAGCGGAGC 
AGACAGAGAC 
GCTTCTCGCT 
ATGGGAAGTC 



GCAGAGTGAG 
TGAGCGGCCC 
CCTCCTGCCC 

CAACTGCAAT 



CTCTTAGTGC 
ACCAGAGACT 

ACGCGGGCCG CTGTGTCTGC 
CAGGTTACTA 
GGCATTCAGC 
TTCATCAAGA 



CAGCTGCCGC 
TGTTGATGGC 
GCGCCATCAA 



21 

I 

AACCACCAAC 
GGCACCGCCA 
GCAGCCCGGG 
ATCTTTGATC 
GACAACACTG 
AGGGACCGCT 
AACTCTGGAC 
CCAGGCTTCC 
AAGTGTGACT 
AAGCCAGCTG 
GGGGGGAACC 
AGCTCTGCAG 
TGGAAGGCTG 
GATGTGTTTA 



31 

I 

" CGAGGCGCCG 
TGCCTGCGCT 
CCACCTCCAG 
GGGAACTTCA 
ATGGCATTCA 
GTTTGCCCTG 
GGTGCAGCTG 
ACATGCTCAC 



41 

I 

GGCAGCGACC 
CTGGCTGGGC 
GAGGGAAGTC 
CAGACAAACT 
CTGGGAGAAG 
CAATTGTAAC 
TAAACCAGGT 



I 

" CCTGCAGCGG 
TGCTGCCTCT 
TGTGATTGCA 
GGTAATGGAT 
TGCAAGAATG 
TCCAAAGGTT 
GTGACAGGAG 
TGCACCCAAG 



TTACTGGAGA ACGCTGTGAT AGGTGTCGAT 
CTGAGGGCTG TACCCAGTGT TTCTGCTATG 
AATACAGTGT CCATAAGATC ACCTCTACCT 
TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 
GCTCAGCCCA ACGACTAGAC CCTGTCTATT



TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATGTG ATTCTGGAAG 960 

GTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TGAGTTACTT TGAGTATCGA AGGTTACTGC GGAATCTCAC AGCCCTCCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCCGCCCTG 1200

TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACAAGGGGC 1260

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GGGCCTTTTG 1320

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500

CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCCGGG GTCACCGGTG 1560

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620

TGAGGCCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100 

AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGGGCT CTGGGAAGTC 2220 

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGGGC 2340

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400

CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460 

CACTGGTGCG CAAGGCCCTG CATGAAGGAG TCGGAAGCGG AAGCGGTAGC CCGGACGGTG 2520 

CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTCCCTGGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

T CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760

^CAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

rGGGAGA GAGAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880

... Z ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACGAATA 3300 

TGGATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACCAGA GCCAAGAACG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420

AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480

AGACCCAGAT CAACAGCCAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600
AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CT 
AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGG 
GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGA2 

TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200 

ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTCG 4440

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

TGGTGCTGCC TTGCTTCTGT ATTTCCTTGG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTAG ATGCC 

Seq ID NO: 591 Protein sequence 
Protein Accession #: NP_005553.1

1 11 21 31 41 51 

MPALWLGCCL CFSLLLPAAR ATSRREVCDC NGKSRQCIFD RSLKRQTGNG FRCLNCNDNT 60 
DGIHCEKCKN GFYRHRERDR CLPCNCNSKG SLSARCDNSG RCSCKPQVTG ARCDRCLPGF 120 
HMLTDAGCTQ DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VTGERCDRCR SGYYNLDGGN 180 
PEGCTQCFCY GHSASCRSSA EYSVHKITST FHQDVDGWEu- 2RNGSF 

SSAQRLDPVY FVAPAKFLGN QQVSYGQSLS FDYRVDRGGR HPSAHDVILE GAGLRITAPL 300
MPLGKTLPCG LTKTYTFRLN EHPSNNWSPQ LSYFEYRRLL RNLTALRIRA TYGEYSTGYI 360
DNVTLISARP VSGAPAPWVE QCICPVGYKG QFCQDCASGY KRDSARLGPF GTCIPCNCQG 420
GGACDPDTGD CYSGDENPDI ECADCPIGFY NDPHDPRSCK PCPCHNGFSC EVMPETEEVV 480






RCELCADGY 
C DQCKAGYFGD 
NCEHGAFSCP ACYNQVKIQM 
QDILRDAQIS EGASRSLGLQ 
RLITQMQLSL AESEASLGNT 
ETEDYSKQAL SLVRKALHEG 
ADRSYQHSLR LLDSVSRLQG 
NWKEEAQQLL QNGKSGREKS 
QVDNRKAEAE 
IEQEIGSLNL 
AQKVDTRAKN AGVTIQDTLN 
MMSELEERAR QQRGHLHLLE 



PLAPNPADKC 
DQFMQQLQRM 
LAKVRSQENS 
NIPASDHYVG 



P VRPCQPCQCN NNVDPSASGN CDRLTGRCLK 
S EPVGCRSDGT CVCKPGFGGP 



EALISKAQGG 
YQSRLDDLKM 
PNGFKSLAQE 



DGVVPDTELE G



VSDQSFQVEE 
DQLLSRANLA 
QKVSDASDKT 



AKRIKQKADS 
KSRAQEALSM 
QQAERALGSA 



EKTKSLAQQL 
GNATFYEVES 



TLDGLLHLMD QPLSVDEEGL V 
TSIDGILADV KNLENIRDNL P 



AGEALEISSE 
MDAVQMVITE 
KTQINSQLRP 






Coding sequence: 221. ass 



GAGCAACCTC 



TGACTCCTTG 
CATCCTCCTG 
CTTGGAAGAC 
TCTTGCAGGT 
ATTCTATGAC 
TGGCTGGGCT 
CCGAAAAACA 
GAAAGACTAC 
GGACATTGAG 
GTATGGTATT 
AAACATGGCT 



11 

I 

AGCTTCTAGT 
CTTCTCCAGC 
GCCACCTTCG 
CCTGAGCCAG 
TTCATTCTCG 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 



ATCCAGACTC CAGCGCCGCC CCGGGCGCGG ACCCCAACCC
GGCGGCGCAG CGAGCAGGGC TCCCCGCCTT AACTTCCTCC 
GGAGTCCGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 



CCTTCCTGGG 
CCTATGCCGG CGACAACATC 
GCGTGTCGCA GAGCACCGGG 
GCAGCACATT GCAAGCAACC 
CAATCTTTGT GGCCACCGTT 
AGAAGATGAG GATGGCTGTC 
TAGTTGCCAC AGCATGGTAT 
CAGTCAATGC CAGGTACGAA 



GCCATCGTCA 
GTGACCGCCC 
CAGATCCAGT 
CGTGCCTTGA 
GGCATGAAGT 
ATTGGGGGTG 



GCACTGCCCT 
AGGCCATGTA 
GCAAAGTCTT 
TGGTGGTTGG 



ACCTCTTACC 
GTGTGACACA 
ATACTATCAT 
ACAAAACAAA 
TAATCTTATT 



TATATATAGA 
CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 



TATGTATATA 
TGATACTAGC 
GAAGATGTTT 
TCATTTACTC 
AAGGATGAAT 
CCATAATCTT 
CTCTATCTCC 
AATTTATTAC 
CCTGTTGACC 
AAATATTTGT 
TTGATTGAAT 
TCCCCATTCC 
TAATAAGGTG 
TGACAAATAT 
ATCTGCCAAA 



CAACACCAAG 
GAGGCAAAAG 
TAACATTAGG 
CAAACAAACA 
TTATCTTCTT 
GAGTAATCAT 
TACATGTTTT 
ATACTTAAAA 



TTTGGTCAGG 
CTACTTTGCT 
AAACCTGCAC 



CGATATTTCT 
TCGTTCAAGA 
CTCTCTTCAC 
GTTCCTGTCC 



AAAAACCCAT 
TCCTCAATAT 
ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 



GTGTTAAAAT 
AGGAGGGAAG



AACCGAAAAT 900 



TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTGAGT 
TTTTAAGCTA 
TTAATTGTAT 
TGGTCTGTTT 
TCTCTCTGTA 
TTGAGATAAT 



ACATTTCATA 
TTTGGAGGCA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTGAACAA 
GCTGTAAGCA 
GATACTTAAC 



ATAGACAGTA 
ATAGGTAAAT 
GTCCTTATAT 
CCTTTGCCAC
GCCCTTTTCA 
AAGCCCTTAT 



ACTCAGTGCT 
ATTTTACCAT 
GCTCCTTAAA 



GTATTTAATT 
ACATATGTAA 
AAGACCTAGC 
TATACTTATT 



TCTGACCCAT 



AGTGTAATTA 
AGTGCTAGAC 
AGTCACTTAA 



TAGTTTCTAA 
CATGACCAAA 
AGCACTCTTG 
GGTGTTGTAA 
CCCCTAAACT 
TCATGCGTTT 
TTTCTGGAGT 




ACACATACCT 
AAACCTACGC 
ATTCTTTCAG 
TTTCCAGTCT 
GCACTGGTGT 
AGCAAGGCAT 
CTGATCTTCC 
GTGGTTTTGT 



CTGTGTCTGA 
GTACAGAATG 
CTGGAGACCT 
TTGGCTGCTG 



GCCTTAACCA 
AAGATTCTGA 
ACAGATGTAA 
TTTTGAATCA 
TGTTAGCTGG 
CTACACAAGG 



AATTTGAAAA 
TTGCTTTTCA 
GTCTCTCAAG 
GGAAGTCTTA 
TGGGAAGAAA 
TAATAACTCA 
CAGCTGACGC 
AAAGTCAGCC 
ACCTGAGAAT
CTTCATGATG 
TTTATGGCCC 
TTATATTCTT 
AATTTGTATA 



TAAGCTTATT 
TGATGTTGTG 
GTGCTATACT 
AATGTTTGAA 
TGATGAGACA 
TCTTCTGCAG 
TAAAAGCCTA 
TAAGGTGCTA 



TGAGCAAGAT 
CTTGGTGCTA 
GCTTCATCTG 
GGGATCCAGT 



AATAAAAAAA 
GTGAAGTAAA 
TGAGTATGGC 
CGTGTTGGTA 
TCTGTTCAGT 
GTTAGTTTGG 
ATGAGGAATT
GGAAGTTAAA 



TCAATCACCG 
TAAGCGGTGG 
GAGATAGAAT 
ATTGAGGAAT 
TGTTAAGAAA 
ATTGAGTGCA 
CCAATGCTTT
AATCCAACAG 
GATGCCCTCA 
AAATGGTACT 
GGACCTAATA 



AGGTAGTGTG 
ATGTAGTGTC 
ACACACGTAC 
CAAAACCTAC 
CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGTGTTG 



AAAATGACCA ACGAAATTGT 
CTACCACACC TGGAAACAGA 
AAGCATTACT CTTTTTCAAT 



GATATCAGTT
TACAATAGAA 
CCAATAGACA 
AAATTGTTTT 



TGGGTTTCTT 
CTAAACGAAT 
CTGTGGCTAA
CAAGGGAGAT
GAGCTCTTGC
TCATAATAAA
AATTTTAGTG 
CTTTTGCCAC 
ACCAAACATT 
TTTATCCAAT 



TTAATTTAAA 



2220 



3000 
3060 
3120 
3180 
3240 
3300 
3360 



MAKAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYACD'it 'TAQAMYEGI WM£ QSTG 

QIQCKVFDSL LNLSSTLQAT RALMVVGILL GVIAIFVATV GMKCMKCLED DEVQKMRMAV

IGGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVNARYE FGQALFTGWA AASLCLLGGA

LLCCSCPRKT TSYPTPRPYP K 







I 



I 



CGGCCGGTGC 



AGCGGTTCGC 
CGGACCAGCT 
TAAGAGAGCC 



GGCATGGACC 



TGGTGCAGCG 
GATCCTGAGA 
GAAGATGATG 
AAATTTGTGG 
CGAAACAAAC 
ATCCTGGTGG 
GAGGCTAAAT 
ATTCCCCTGG 
CCTAACCTCA 
CCGGTTCCTA 
AGCCACACAC 
ATCTGTTGTG 
CATTTTGCAC 
CCATTCACTG 
TTGAATGAGT 
GGCTGCCTCC 
AAGAATGAGT 
ATTGACGATG 



ACCCTTCTCC 
ACATCACCGA
TTGAAGCTTA 
CTCATAAAGC 
TGACGAGTTT 
GCAATCCATT 



GGAATCTGCG CCCCAGAGAG TCCCGGACGC CGCCGGTCGG 
GCAGCGACGG CCGCc 3C GA GCT '2itGCA GCGG7AGCGC 
TATGCCGGGA CCACTGTGAA CCCTGCCGCC TGCCGGAACA 
CAGCCTCTGA TAAGCTGGAC TCGGCACGCC CGCAACAAGC 
GCAAGCGCAG GGAAGGCCTC CCCGCACGGG TGGGGGAAAG
CAGGCACTCG GGCTGGCACT GGCTGCTAGG GATGTCGTCC 
CGCCATGGCG CGGCTCTGGG GCTTCTGCTG GCTGGTTGTG 
CGCCTGTCCC ACGTCCTGCA AATGCAGTGC CTCTCGGATC
TGGCATCGTG GCATTTCCGA GATTGGAGCC TAACAGTGTA
AATTTTCATC GCAAACCAGA AAAGGTTAGA AATCATCAAC 
TGTGGGACTG AGAAATCTGA CAATTGTGGA TTCTGGATTA 



GGATTTTGCC 
ATGAAAGGCC 
ATCTCCAATG 
GGAATGACCA 
CTCAAGCCAG 
CTAGGCGAAG 
CAGGACAAGA 



GACCTCAACA 
CCGCCCACGG 
ATGGTCTACC 



CAAACCTGCA 
CTGTGGAGGA 
ATATGTATTG 
AGGGCTCCTT 
TGGCGGAAAA 
CAACTATCAC 
TGAAAGGCAA 
CCAAATACAT 
AGCTGGATAA 
ATGGGAAGGA 
GTGCAAACCC 
TCGGGGACAC 
GTCGGGAACA 
TTTTGGTAAT 
CAGCCTCCGT 
GGAGTAACAC 
AGATCCCTGT 
ACACATTTGT 
GAGCCTTTGG 
TCTTGGTGGC 
GTGAGGCCGA 
GCGTGGAGGG 
AGTTCCTCAG 
AACTGACGCA 



ACTGACTACT 
AGCATCATGT 
TGGGAGATTT 
GAGTGTATCA 
GAGCTGATGC 
CATACCCTCC 
GGCCCTTTTC 
CATCTTTTAA 
ATCAAAGACT
GTATTGACTT 

ACCCTTTCTT 
CTTAACAAAC 
TGCCTTGTTG 
TAAACTTTGT 
TTTATTATTA 
AACTTGTGTT 
ATTTATTATG 
GTCCCTACTT 
CCTGAGGACC 
TCCCATCACC 



ACTTGCTGGT 
ACAGGGTCGG 
ACAGGAAATT 
TCACCTATGG 
CTCAGGGCCG 
TGGGGTGCTG 
TTCAGAACTT 



GTCTAGGAAA 
TACATGCTCC 
CACTCAGGAT 
GATACCCAAT 
AGGAAAGTCT 
GGATGTTGGT 
AAGGATAACT 
TCTTGTAGGA 
ATTTCTCGAA 
CCCCAAACCA 
CTGTACTAAA 
TCCCACTCAC 
TGAGAAACAG 
AAATTATCCT 
CACGAACAGA 
TCTCTCGGTC 
GCTGTTTCTG 
TATCAGCAAT 
TCCATCTTCT 
CATTGAAAAT 
TCAGCACATC 
AAAAGTGTTC 
AGTGAAGACC 
GCTCCTGACC 
CGACCCCCTC 
GGCACACGGC 
GTCGCAGATG 
GCACTTCGTG 
GAAAATCGGG 
TGGCCACACA 
CACGACGGAA 



TGTGACATTA 



TGTGGTTTGC 
ATCACATTAT 
AACCTGGTTT 
AACATTTCAT 
GAAGATCAAG 
TCTCCAACCT 
GCGCTTCAGT 
ATACATGTTA 
ATGAACAATG 
ATTTCTGCTC 
GATGTAATTT 
AGTAATGAAA 
TATGCTGTGG 
CTTAAGTTGG 
GATGATGACT 
TCGGAAGGTG 



ACCTTGACTT 
TGTGGATCAA 
TGAATGAAAG 
CATCTGCAAA 
CCTGTAGTGT 
CCAAACATAT 
CCGATGACAG 
ATTCTGTCAA 



GACTCTCCAA 
CAGCAAGAAT 
TCTGGCCGCA 
GGCAGGTGAT 
GAATGAAACA 



CAGACCACCA C 



GGTTCTATAA 
CCAATCACAC 
GGGACTACAC 
ACTTCATGGG 
ATGAAGATTA 
TCCCTTCCAC 
TGGTGATTGC 
CAAGACACTC 
CTGCCAGCCC 



CGGGGCAATA 
GGAGTACCAC 
TCTAATAGCC 
CTGGCCTGGA 



AGACGTCACT 
GTCTGTGGTG 
CAAGTTTGGC 
ACTCCATCAC 
TGTCATTATT 



AAGCGACATA 



CTGAAGGATG
AACCTCCAGC 
ATCATGGTCT 



ACATTGTTCT 
GCTATAACCT CTGTCCTGAG 
TGCACGCAAG 



CTGCCGCTGG 
CCGAGAAGCT 
CTTTTTGGCA 
TAAATTTTCT 
TTGAATCAAT 
GTAATTTGTT 
TATTCCTGCC 
CACTTCTGCT 
TTACTGTTCT 
CAATCTGTGA 
AACCGCAATA 
AGGAAATACT 
TTTCTGAGGA 
AGAAATGATA 



AGTCCTGCAG 
GCAGCGAGAG 
GGCCAAGGCA 
TCCTTCCCAA 
AGGCCACCAA 
CTCGAGGGAA 
TTATCTCTTT 



CTGCATATAG 
CACCGCGATT
GACTTTGGGA 
ATGCTGCCCA 
AGCGACGTCT
TGGTACCAGC 
CGACCCCGCA 



TGCTGATGGC 



TCTCCGGTCT 

GCTGCTCTCC 
GCAGTGTGTA 

CTCTCTTTCC 



CTGGCTTCTG 
ATATCAGCAG 
TTTGATGTGG 
GTACAGATAT 
TATTGTTTTT 
AGGCTTTATC 
TGGGAGGAAC 
CAGCAACTGT 
GTAAAAAGAC 



CATTACTATT 



ATGAAAAAAA 
CGAGAGTTTC 
GGATGGCTTA 



AAAGACAACC 
TAGCTGGGAA 
TACTGGCCTC 
GAGAGCAAAG 



TGTCCCGGGA 
TTCGCTGGAT 
GGAGCCTGGG 
TGTCAAACAA 
CGTGCCCCCA 
GGAAGAACAT 
ACCTGGACAT 
AGACGGGCTG 
TTCACTCTGA 
CTTCTTCATC 
CTCTCCCTTG
TTCCCTGCTT
AACTCTGCAT
TTGCCCACCA 
GGGAAAACAA 
TATGGATTCA 
AGCCTGTGTA 
TAAAACCAGA 
ACTGGGATCA 
GAATGTATTC 



GAACTGCCTG
CGTGTACAGC 
GCCTCCAGAG 
GGTCGTGTTG
TGAGGTGATA
GGAGGTGTAT 
CAAGGGCATC 
TCTAGGCTAG 
AGAGGATGAA 
CAGTATTAAC 
CATAGACACA 
GTTGTTCCTT 
CACGATTCTT 



2280 
2400 



2760 
2880 



CAACTAACAA 
ATATTTCACT 
CTTCTATTTA 
TAAAAAAGAA 



ATGGCTT 



GCTGGTGTCA 
GGCACCTTCC 
ATGATTCTTT



I 



MSSWIRWHGP 



I 



I 



I 



I 



I 



F ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 
5 I INEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 
F TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 
3 IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 
NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 
WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 
LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST
DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGPASV ISNDDDSASP
LHHISNGSNT PSSSEGGPDA VIIGMTKIPV IENPQYFGIT NSQLKPDTFV QHIKRHNIVL 
KRELGEGAFG KVFLAECYNL CPEQDKILVA VKTLKDASDN ARKDFHREAE LLTNLQHEHI 
VKFYGVCVEG DPLIMVFEYM KHGDLNKFLR AHGPDAVLMA EGNPPTSLTQ SQMLHIAQQI 
AAGMVYLASQ HFVHRDLATR NCLVGENLLV KIGDFGMSRD VYSTDYYRVG GHTMLPIRWM 




P TTESDVWSLG VVLWEIFTYG KQPWYQLSNN EVIECITQGR VLQRPRTCPQ
W QREPHMRKNI KGIHTLLQNL AKASPVYLDI LG 

Seq ID NO: 596 DNA sequence 
Nucleic Acid Accession #: AF410899
Coding s 



CGCGCTCTAC GCGCTCAGTC CCCGGCGGTA 
GGCGTGAGGC GCCGGAGCCC GGCCTCGAGG 
GCCCCAGAGA GTCCCGGACG 







AGCGGTAGCG 
CTGCCGGAAC 
CCGCAACAAG 
GTGGGGGAAA 
GGATGTCGTC 
GGCTGGTTGT 
CCTCTCGGAT 
CTAACAGTGT 
AAATCATCAA 
ATTCTGGATT 



TGTCTGAACT 
AGACTCTCCA 
GCAGCAAGAA 
ATCTGGCCGC 



CCCCCCTGTA 



CACCGAGGAG 
GCGGCCGGTG 
CTGGATAAGG 



CGCCGGGCCA 
CCGGACCAGC 



CTGGTGCAGC 



CAGCGCGGGG 
TGGCATGGAC 
AGGGCCGCTT 



CTATGCCGGG 
TCAGCCTCTG 
CGCAAGCGCA 



CCGCCATGGC 



GGGCTGGCAC T 



CGAAGATGAT 
AAAATTTGTG 
CCGAAACAAA 
GATCCTGGTG 



TGAATGAAAC 
GTGGGAAGCA 
ACCTCACTGT 



GCTGGCCTGG 



CAGACGTCAC 
CGTCTGTGGT 
CCAAGTTTGG 
GTGTTGGCCC 
TCTCCAATGG 
GAATGACCAA 
TCAAGCCAGA 
TAGGCGAAGG 
AGGACAAGAT 



TATTCCCCTG 
ACCTAACCTC 
TCCGGTTCCT 
AAGCCACACA 
GATCTCTTGT 
GCATTTTGCA 
TCCATTCACT 
ATTGAATGAG 
CGGCTGCCTC 
CAAGAATGAG 
AATTGACGAT 
AGCGAATGAC 



AACATCACCG 
GTTGAAGCTT 
GCTCATAAAG 
CTGACGAGTT 
GGCAATCCAT
TCCAGTCCAG 
GCAAACCTGC 
ACTGTGGAGG 



GGGATTTTGC 
CATGAAAGAT 
AGCCTCCGTT 
GAGTAACACT 



TGGTCTACCT 
TCGGGGAGAA 
CTGACTACTA 
GCATCATGTA 
GGGAGATTTT 
AGTGTATCAC 
AGCTGATGCT 
ATACCCTCCT 
GCCCTTTTCC 
ATCTTTTAAC 
TCAAAGACTC 
TATTGACTTC
TTCTTTTTTT 
CCCTTTCTTT 
TTAACAAACG 
GCCTTGTTGT 
AAACTTTGTC 
TTATTATTAT 
ACTTGTGTTC 



CACATTTGTT 
AGCCTTTGGA 
CTTGGTGGCA 
TGAGGCCGAG 
CGTGGAGGGC 
GTTCCTCAGG 
ACTGACGCAG 
GGCGTCCCAG 
CTTGCTGGTG 
CAGGGTCGGT 
CAGGAAATTC 
CACCTATGGC 
TCAGGGCCGA 
GGGGTGCTGG 



CAGGGCTCCT 
GTGGCGGAAA 
CCAACTATCA 
GTGAAAGGCA 
TCCAAATACA 
CAGCTGGATA 
TATGGGAAGG 
GGTGCAAACC 
ATCGGGGACA 
GGTCGGGAAC 
CTTTTGGTAA 
TTCTCATGGT 
ATCAGCAATG 
CCATCTTCTT 
ATTGAAAATC 
CAGCACATCA 
AAAGTGTTCC 
GTGAAGACCC 
CTCCTGACCA 
GACCCCCTCA 



CTGGCATCGT 
AAATTTTCAT 
ATGTGGGACT 
CATTTCTGAA 
TGTCTAGGAA 
TTACATGCTC 
ACACTCAGGA
AGATACCCAA 
AAGGAAAGTC 
GGGATGTTGG 
TAAGGATAAC 
ATCTTGTAGG 
CATTTCTCGA 
ACCCCAAACC 
TCTGTACTAA 
ATCCCACTCA 



CACGTCCTGC 
GGCATTTCCG 
CGCAAACCAG 



AAACAGCAAC 
ACATTTCCGT 
CTGTGACATT 
TTTGTACTGC 
TTGTGGTTTG 
TATCACATTA 



AGATTGGAGC 
AAAAGGTTAG 
ACAATTGTGG 



CACCTTGACT 
ATGTGGATCA 
CTGAATGAAA 
CCATCTGCAA 
TCCTGTAGTG 
TCCAAACATA 
TCCGATGACA 
GATTCTGTCA 
TCAGACCACC 



TAACATTTCA 
AGAAGATCAA 
ATCTCCAACC 
AGCGCTTCAG 
AATACATGTT 
CATGAACAAT 
GATTTCTGCT 
TGATGTAATT 
AAGTAATGAA 



TGCTGTTTCT GCTTAAGTTG GCAAGACACT 
TTGGATTTGG GAAAGTAAAA TCAAGACAAG 
ATGATGACTC TGCCAGCCCA CTCCATCACA 
CGGAAGGTGG CCCAGATGCT GTCATTATTG 
CCCAGTACTT TGGCATCACC AACAGTCAGC 
AGCGACATAA CATTGTTCTG AAAAGGGAGC 
CTATAACCTC TGTCCTGAGC 
CAGTGACAAT GCACGCAAGG 
GTCAAGTTCT 



TCGCAGATGC 



AAAATCGGGG 
GGCCACACAA 
ACGACGGAAA 
AAACAGCCCT 
GTCCTGCAGC 
CAGCGAGAGC 



T GTCAAACAAT 
C GTGCCCCCAG 
G GAAGAACATC 



CCAGACCGAT C 



C GTACTCCTCA 



TGAATCAATC 
TAATTTGTTA 
ATTCCTGCCT 
ACTTCTGCTG 
TACTGTTCTT 
AATCTGTGAA 



TATCTCTTTC 
TTTCTTCTTT 
TGGCTTCTGC 
TATCAGCAGA 
TTGATGTGGA 
TACAGATATC 



CAGTGTGTAC 
TCTCTTTCCA 
TTTTTCGTCT 
ATTACTATTA 
CACTCCAGTT 



GACGGGCTGA 
TCACTCTGAC 
TTCTTCATCC 
TCTCCCTTGG 
TCCCTGCTTC 
ACTCTGCATA 



GCCTTTATCT 



GAGAGTTTCT 
GATGGCTTAA 
ATGGGAGATT 



ATGGATTCAC 



T TTCTGAGGAG 



ATGGCGCATA 



AGCAACTGTT 
TAAAAAGACT 
CGTGCAGTAG 
ACACAGTTTT 
ACAGAACCTT 



GTCTTCGTAG 
TGTCAACTTC 
GAAAGCC 



AAAACCAGAG 
CTGGGATCAG 
AATGTATTCG 
GTGCCATGGA 
TGGCTTCCGT 
GTTGTGATGA 
AGTTGAAAAG 



GAGGGCAACC

GCCGCGGGCA 

AACTGCCTGG 



CTGATGCCGT G 

TGCATATAGC CCAGCAGATC GCCGCGGGCA 2520 
ACCGCGATTT G 
ACTTTGGGAT G 
TGCTGCCCAT T 
GCGACGTCTG G 
GGTACCAGCT G 
GACCCCGCAC G' 
CCCACATGAG G. 
ITCCGGTCTA C 



CCTCCAGAGA 
GTCGTGTTGT 
GAGGTGATAG 
GAGGTGTATG 
AAGGGCATCC 
CTAGGCTAGG 
GAGGATGAAC 
AGTATTAACA 
ATAGACACAG 



ACGATTCTTA 
GACAAAGGCC 
AACTAACAAT 

TTCTATTTAT 
AAAAAAGAAA 
AGAAAGAAGA 
CTGGTGTCAG 
GCACCTTCCC 
TGATTCTTTT 
GAGACACAAG 
TAGCACTGGT 
AGGTGGATTC 



3120 
3180 
3240 
3300 
3360 
3420 
3480 

3600 
3660 
3720 

3840 
3900 



ACPTSCKCSA SRIWCSDPSP G 

SGLKFVAHKA FLKNSNLQHI 
TLQEAKSSPD TQDLYCLNES 




NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360 

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420 

DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKDFSWF GFGKVKSRQG 480

VGPASVISND DDSASPLHHI SNGSNTPSSS EGGPDAVIIG MTKIPVIENP QYFGITNSQL 540

KPDTFVQHIK RHNIVLKREL GEGAFGKVFL AECYNLCPEQ DKILVAVKTL KDASDNARKD 600 

FHREAELLTN LQHEHIVKFY GVCVEGDPLI MVFEYMKHGD LNKFLRAHGP DAVLMAEGNP 660 

PTELTQSQML HIAQQIAAGM VYLASQHFVH RDLATRNCLV GENLLVKIGD FGMSRDVYST 720 

DYYRVGGHTM LPIRWMPPES IMYRKFTTES DVWSLGVVLW EIFTYGKQPW YQLSNNEVIE 780
CITQGRVLQR PRTCPQEVYE LMLGCWQREP HMRKNIKGIH TLLQNLAKAS PVYLDILG 



Coding sequence 



I I I I I 

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC C 
CTCTGGGTCC TTAATGGCAG CAGCCGGCGC TACCAAGATC CTTCTGTGCC T 

GCTCCTGCTG TCCGGCTGGT CCCGGGCTGG GCGAGCCGAC CCTCAC1

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTCACAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480

TGGATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540

AATGTGGACA ACGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720

CTCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCAT 780

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960 

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAG 1080

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ATAATCACGT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCCG TGTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320 

TTTAAATAAA GAGTTCTATT TCCCAAAAAA AAAAAAAAAA AA

Seq ID NO: 599 Protein sequence 



1 11 21 31 41 51 

I I I I I I 

MAAAAATKIL LCLPLLLLLS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT 
FLHYDCGNKT VTPVSPLGKK LNVTTAWKAQ NPVLREVVDI LTEQLHDIQL ENYTPKEPLT
LQARMSCEQK AEGHSSGSWQ FSFDGQIFLL FDSEKRMWTT VHPGARKMKE KWENDKVVAM
MGDC IGWLEDFLMG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCOillLEC 

PILPGI 

Seq ID NO: 600 DNA sequence 

Nucleic Acid Accession #: NM_001898.1 



GCCCCAAGGA GGAGGATAGG ATAATCCCGG GTGGCATCTA TAACGCAGAC C 
AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC Ai 
ACTACTACAG ACGTCCGCTG CGGGTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 
ATTACTTCTT CGACGTAGAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CCCAACTTGG 
ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTCGAGA
TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 
AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 
CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 
GACAGACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 
CTTCCTTCTT GCTTCTAATA GCCCTGGTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 
AAACAGTAGC ATCGCC 



MAQYLSTLLL LLATLAVALA WSPKEEDRII PGGIYNADLN DEWVQRALHF AISEYNKATK 
DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LDTCAFKEQP ELQKKQLCSF 
EIYEVPWENR RSLVKSRCQE S 

Seq ID NO: 602 DNA sequence 

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299..961







CTCTGAGCTT CTCTGAGCCT 
CATGGAGTTG TGAAAGAATA 
CTACTTCTGC TGGGTTGAGT 



CAGGAGGGTG 



GCCGCCGCAG C 



GGGCTGCCGC 
CGACGAGCTG 
CGACCTCAGC 



CAACAGCACC 



GGGGAACAGC 
CTTGGAGGCC 
CCCACCCTGG 
CCCCGCAGCC 
CTGCCGGGGG 

GCGGCGCGGG 
CTGCGCTCGC 
GTGCGTTTCC 
CTGGCCAGCC 
CAGCCCTGCT 



TGTTTGCTCA 
GCTGCAAAGC 
CTAGCTGTGT 
ACAAAAGATA 



TCTGGAAAAA GGGGATTAAA CCATTTACCT 
ACCTAACACA TAGTAAGGTT CCCAGTGCAG 
AGGCCCCTTG TTCCTCACCT GGAGAAACTG 
ACTCATCTCT TAATTTGCAA GCTGCCTCAA 
CTGATGGGCG CTCCTGGTGT TGATAGAGAT 
TCTCCACGCT GTCCCACTGC CCCTGGCCTA GGCGGCAGCC 
CCGCTCTGGC TCTGCTGAGC AGCGTCGCAG AGGCCTCCCT 
CTGCCCCCCC- CGAAGGCCCC CCGCCTGTCC TCCCGTCCCC 
GACGCACGGC CCGCTGGTGC AGTGGAAGAG CCCGGCGGCC 
CCGCGCCCCC GCCGCCTGCA CCCCCATCTG CTCTTCCCCG 
CTGGGGGCCC GGGCAGCCGC GCTCGCGCAC CGGGGGCGCG 
AGCTGGTGCC GGTGCGCGCG CTCGGCCTGG GCCACCGCTC 
GCTTCTGCAG CGGCTCCTGC CGCCGCGCGC GCTCTCCACA 
TACTGGGCGC CGGGGCCCTG CGACCGCCCC CGGGCTCCCG 
GCCGACCCAC G( 






CAGCCCCAGA GCCCTCACCC 
GGAGCCCTTC GGACCCACTT 
CCCTCCTCTG ATGAACACTA 
ACAGCATTTG AAGGACACAT 
CCTGTACTCA CTCATGGGAG 

Seq ID NO: 603 Probe: 



TGGATATCAT C 



CCAGGGCTTT GCAGACTGGA CCCTTACCGG 
TCAGCCAGGG 
CCCCGAACAG 
AGCCTAAAAG 
CTGGCACTGG 
GGCATCAGCC 
CTTGGTTGAA 



CTCACAGACT C 
CAGTGGCTGA G 
ATTGCAGTTG C 
CTGGCCCC 



GCCTGCGGCT GCCTGGGCTG 
TGGCTCTTCC TGCCTGGGAC 
ACGAAGGCCT CAAAGCTGAG 
GTGAAGGGAC AACTGACTAG 
ACACCAGAGA CCTCAGCTAT 
CCAGGCCTCG AACCTGGGAC 
CCCGCCCAGG CCCTGTAGGG 
AGTGCCTGTG CTGGAACTGG 



MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 
RGCRLRSQLV PVRALGLGHR SDELVRFRPC SGSCRRARSP HDLSLASLLG AGALRPPPGS 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG 



Coding sequence 



CGCGTGTCTA 



GAGAGAAGAA 
ATCTGCACGT 

CCCTCACTCA 



I 



I 



I 



I 



CGGGGCAGGG 
CACCGGACGG 
CAGACAAGGC 



TAAAAGAGGC 



GAGGGGCCCC 
GGGACTGGAT 
GCGCTCCCAG 
CTGCGGCGGC 
CCGGGGGCTC 
GAGCCCCACA 
ACTGCCAGGT 
GGTCCCCGGA 



CAGAGAGCAG CTGCTGCAGG GCAGACAGCC 
AGCCGCCCCA CGCAGGGACC GGCTTACCCC 
CCCTCGGCCC GGCCTCCCAG CTCTCTACTT 
CGTGCCTCTC CACCGCTCGA GTTCTCTACT 
TCCCAGCATC TACCCCCCTC CCAACCTCGG GGGACCTAGC 
AAAGGTGGGG 



GGGCAGGAGG 
CCCGAGGGTG 
AAGGTGCCTA 



CAGACTGGCT 
GGGCATGCGC 
GAAGAACAAG 



GGCCCCAGCC CTCGCTGCCA 540 



T TGGACTTGGA G 



CCCTGGGCTC CGCGCCCCGC 
CCCCCGCCGG CCACCTGCCG 
GGCCGCCGCC GCAGCCTTCT 



TGTTTGAGCT 
GTGCAGGACC 
GGCGCTCCTG 
CTGCCCCTGG 
GAGCAGCGTC 



GGGGGACGCA C 



CCTAGGCGGC 
GCAGAGGCCT 
GTCCTGGCGT 
AGAGCCCGGC : 



CACACGACCT CAGCCTGGCC AGCCTACTGG G 



Z TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 
C CACO 



GGACCCTCCC 



CTAGCAGCCC 
CTATGGAGCC 
GGACCCCTCC 



CTGGCCTGTA 



GCAGAGTCCC 
CTACCGGTGG 
CAGAGCCCTC 
CTTCGGACCC 
TCTGATGAAC 
TTTGAAGGAC 
CTCACTCATG 



CTTTGCAGAC TGGACCCTTA 
ACTAGCCAGC GGCCTCAGCC 
GTGATGGATA TCATCCCCGA 
ACCCTGCGGA TCCCAGCCTA 
ACTTCTCACA GACTCTGGCA 
ACTACAGTGG CTGAGGCATC 



GGAGCTGGCC CC 



AGGGACGAAG 
ACAGGTGAAG 
AAAGACACCA 
CTGGCCAGGC 
AGCCCCCGCC 
TGAAAGTGCC 



TTCCTGCCTG 
GCCTCAAAGC 
GGACAACTGA 
GAGACCTCAG 
CTCGAACCTG 
CAGGCCCTGT 
TGTGCTGGAA 



Seq ID NO: 605 Protein sequence 



1200 
1260 
1320 
1380 
1440 
1500 

1620 
1680 
1740 
1800 
1860 



MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 

RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG 



I I I I I I

ATGCCCGGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60

CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGCGCA GCCTGCCCTG 120 

TGGCCCACCC TGGCCGCTCT GGCTCTGCTG AGCAGCGTCG CAGAGGCCTC CCTGGGCTCC 180

GCGCCCCGCA GCCCTGCCCC CCGCGAAGGC CCCCCGCCTG TCCTGGCGTC CCCCGCCGGC 240 

CACCTGCCGG GGGGACGCAC GGCCCGCTGG TGCAGTGGAA GAGCCCGGCG GCCGCCGCCG 300 

CAGCCTTCTC GGCCCGCGCC CCCGCCGCCT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360

CGCGCGGCGC GGGCTGGGGG CCCGGGCAGC CGCGCTCGGG CAGCGGGGGC GCGGGGCTGC 420 

CGCCTGCGCT CGCAGCTGGT GCCGGTGCGC GCGCTCGGCC TGGGCCACCG CTCCGACGAG 480 

CTGGTGCGTT TCCGCTTCTG CAGC ( 3 TCTCC ACACGACCTC 540 

AGCCTGGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCGGCCCGTC 600 

AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660

ACCTGGAGAA CCGTGGACCG CCTCTCCGCC ACCGCCTGCG GCTGCCTGGG CTGAGGGCTC 720 

GCTCCAGGGC TTTGCAGACT GGACCCTTAC CGGTGGCTCT TCCTGCCTGG GACCCTCCCG 780 

CAGAGTCCCA CTAGCCAGCG GCCTCAGCCA GGGACGAAGG CCTCAAAGCT GAGAGGCCCC 840 

TACCGGTGGG TGATGGATAT CATCCCCGAA CAGGTGAAGG GACAACTGAC TAGCAGCCCC 900 

AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAG AGACCTCAGC TATGGAGCCC 960

TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACCTGG GACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GCCCCCGCCC AGGCCCTGTA GGGACAGCAT 1080 

TTGAAGGACA CATATTGCAG TTGCTTGGTT GAAAGTGCCT GTGCTGGAAC TGGCCTGTAC 1140
TCACTCATGG GAGCTGGCCC C 






MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 
RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRPCSGS CRRARSPHDL 
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSPMDVNS TWRTVDRLSA TACGCLG 

Seq ID NO: 608 DNA sequence 
Nucleic Acid Accession #: NM_057090.1
Coding sequence: 29..715

40 i 

I I I I I I 

CTGATGGGCG CTCCTGGTGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 
GTCCCACTGC CCCTGGCCTA GGCGGCAGGC TCCACTTGGT CTCTCCGCGC AGCCTGCCCT 
GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 
CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT CCCCCGCCGG
CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 
GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 
CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG CGCGGGGCTG 
CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC GCTCCGACGA 
GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC CACACGACCT
CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT CCCGGCCCGT 
; TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG 
3TGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 

55 GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC T( 

CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900 

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960 

CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020 

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACAGCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140
CTCACTCATG GGAGCTGGCC CC 

Protein Accession # 



MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE
GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 
SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG 

Seq ID NO: 610 DNA sequence 

Coding sequence: 1..1746 

1 11 21 31 41 51 

ATGCCACTGA AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 
GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 
GGGGCACGCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA A 
CTCAACACGC ACATCACTGA ACTCAATGAG T 
GCCCTGAGGA TTGAGAAGAA TGAGCTGTCG CGCATCACGC C 
GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC 
TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 
CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA CGGCAACCAC 
CTGGAATACA TCCCTGACGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG
GGCAAGAATA GCCTCACCCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAG 600

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACCGTC TTACTCTCTT TGGGAATTCC 840

CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAG 960 

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020 

ACGGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CAGACAGCTC 1140

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200 

CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCCGCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440

GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCAGTTACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560

ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAG CGGGCTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740

TGTTAAAGAG GCAGGC1GGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGA7TGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920 

CGTGCCGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980

GGATTTCCGA TTCATACCCC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040

ACCTGTCCTC CAAGAACAGC CTTCCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTCG 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220 

ACCCAGCATG TCCCCTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340

CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400

AGACAGAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460

CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580

AAAATCAGCT TATTAATACG GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700 

GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGG 2880

TCCCCCCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940

TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000 

AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060

GTCCGTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120

TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180

ACTTAGGGGA AGTGAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240

GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGGT ACACCGATGG GTCACACTGG 3360 

GTCTGGGGGC TCCCTGGAGC TCCTCCTGCG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480

CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCCGATCGGC 3660

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCCGA 3720 

TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780

CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960

AGATGAGGCC CGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080 

CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140

GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCG 4200 

GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320 

GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440

CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500 

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620

GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740

GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800 

TGCACAGATA CTCTTCAAGC ACTGGACGTG GATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860 

GGTAGGAGTG CCGCCTCTAC CCACTTGTGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920

GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTTGTACAAG AGCTCATGGC 4980

TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040

TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100 

GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160 

TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340

AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

GCCCCCAGAT CCCACAGTCA GAACTGAATC TGCGTTGTTG GGAAGCCAGC AGTGGCCTTG 5460 

GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG CAAGCCACTT CCGGGGAAAA 5520

CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCCGCTGC 5580

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGCAAATCA 5640

GCCCCAGTGC TTGGCGATGC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTG 5700

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCTGCTGG ATGCTGCTTG TAATCCATTT 5760
I



I 



I 



L VGCQAWGAGL AYHGCPSECT CSRASQVECT GAHIVAVPTP LPWNAMSLQI 
jNISALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIGL 
FQGLDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 
GKNSLTHISP RVFQHLGNLQ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 
FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ LNRLTLFGNS LKSLSLGIFG PMPNLRELWL 
YDNHISSLPD NVFSNLRQLQ VLILSRNQIS FISPGAFNGL TELRELSLHT NALQDLDGNV
FRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 
YDNPWRCDSD ILPLRNKLLL NQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE
VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVW GMTQAQSGLA 
IAAIVIGIVA LACSLAACVG CCCCKKRSQA VLMQMKAPNE C 

Seq ID NO: 612 DNA sequence 



11 



I 



I 



ATGATGCATT TGCTCAATTC TCAGGGCTGG AATGAGCCGG CTGGTCCCCC AGAAAGCTGG 
AGTGGGGTAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGACAGT CCCACGCCCA 
TCTGGAGTGG GAGCTGGGAG TCAGTGTTGG AGAAGAAACA ACAAAAGCCA ATTAGAACCA 
CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACTGGAC GTGGATTCTC 
TCTCTAGCCC TCAGCACCCC TGCGGTAGGA GTGCCGCCTC TACCCACTTG TGATGGGGTA 
CAGAGGCACT TGCTCTTCTG CATGGTGTTC AATAGGCTGG GAGTTTTATT TATCTCTTCA 
AACTTTGTAC AAGAGCTCAT GGCTTGTCTT GGGCTTTCGT CATTAAACCA AAGGAAATGG 
AAGCCATTCC CCTGTTGCTC TCCTTAG 

Seq ID NO: 613 Protein sequence 



MMHLLNSQGW NEPAGPPESW SGVQSSVFLS VYSSLTVPRP SGVGAGSQCW RRNNKSQLEP
LFLKSAYCAQ ILFKHWTWIL SLALSTPAVG VPPLPTCDGV QRHLLFCMVF NRLGVLFISS 
NFVQELMACL GLSSLNQRKW KPFPCCSP 

Seq ID NO: 614 DNA sequence 

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372



TGGAGGAACA 



CGCCGTCGCG C 
GCCACCATGA G 
AAAGGCAGCA A 



TCACTTTTAC 
CTCTGCCACT 
CCTGGGGAAA 



A ATTTCAGTGT 



TCTGATGCTC 
AGGCGACCCT 
CATGACTGCG 



CATCTACAGG AGGCACCGGG G 



CATCGTCTAC 
GGTGGAAAAC 
CATTGCCTTG 
ACAGACCATC 
CACTGGCTTT 
TGTTGTGAAG 
CACCACCAAA 



TGTCTTTTTC 
GGCTCGAAGG 
AATGAATAAT 
AATGTGGGAG 
ATTCCATGAA 
GCTGTGAGTG 



CTGGGTCGCT 
CTCATCCTAC 
CTGAAGATCC 
TGCCTGCCCT 
GGAAAAGAGA 
CTGATTTCCC 
ATGCTATGTG 
CCCCTCGTCT 
GGATGTGCCC 
ATCCGCAGTC 
CGGGCACCAC 
GTAAGAAGAG 
GTGAACGACA 
GGCCAGGATG 



CCACACACTG 
CAAGGCTTAA 
ACAAGGACTA 
GTTCCAAGGA 
CGATGTATAA 
ATTCTACCGA 



CACCTACGTG 
CTTCATTGAT 
CTCCAACACG 
CAGCGCTGAC 



AACCAGCCCT 



TACCCAAAGA 
CAAGGGGAGA 
ACGCTTGCTC 



CTGCTGACCC 
GTTCCCTCCA 
TGAAGGACAA 
ACACCAAGGA 
CCGCTTTCTT 
ACTGGGAAGA 
ATAGCTTTAC 
GAGGGGTGGT 



CCAATGGAAA 



\ GGAAGTGTAA 



TGTATCAGGA 
TAAGTGTGAG 
GACTGTGATG 
TTGGGTCCCC 
AGCACTGTCT 



GCCAGGCGTC 
AGAGAATGGC 
GCTGGTTGTC 
TAGGCTCTGC 
CCTCACGGAT 
CCTGACTCAA 
TTAAAAAGGG 
GGTGGGCATT 
GCAGCTGAGG 
ACACTAACGA 



TTCAGCTGGG 
GGTGCTATGT 
CAGATGGAAA 
CTCTGAGGCC 
GGTTTGCGGC 
GCCTCATCAG 
AGGAGGACTA 
TGAAGTTTGA 
ACCACAACGA 
CCCGGACTAT 
GCTGTGAGAT 
TGAAAATGAC 

CACTACTACG 
ACAGATTCCT 
ACTTTGACTG 
TACACGAGAG 
CTGGCCCTCT 
ATTTTTGCAG TAGAGTCATC 
ACAGATGGAT TTGCCTGTGG 



G TTTGGCACAA 



CATGTTACTG A 



TAAGAGCTGG 
CCACACAGAG 
CACGTGACAG 
CAGTTTCACT 
TTCATCCAAT 



TGTCTGATTG 
TGCCTGGGAA 



TGTGAGGCCC ATGGTTGAGA 
TCTCTTGAGG GAGCTTAGCC 
CTTCAGGGCA GGGCTCTGAT 
TTTGCACACT TGTTGTGT3G 
TTAAGTCTAA ATATTTCCTT 
GGAGAGGTTA TAGGTCACTC 



\ TGTGATTTTT CTGA
21 31 41 51

I I I I 

L CVLWSDSKG SNELHQVPSN CDCLNGGTCV SNKYF3NIHW NCPKKKG Q 
HCEIDKSKTC YEGNGHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQLGLGKHN 
YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSSP PEELKFQCGQ KTLRPRFXII 
GGEFTTIENQ PWFAAIYRRH RGGSVTYVCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 
RSRLNSNTQG EMKFEVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 
PSMYNDPQFG TSCEITGFGK ENSTDYLYPE QLKMTWKLI SHRECQQPHY YGSEVTTKML 
CAADPQWKTD SCQGDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVYT RVSHFLPWIR 
SHTKEENGLA L 

Seq ID NO: 616 DNA sequence



1 11 21 31 41 51 

I I I I I I 

CGCCAAAGGA AAAGCCCCTT GGATGAGAGG CAGGCGCTTC AGAGAAGCTA AGAAAAGCAC 60

CTCTCCGCGC GCCCCACCTC CTCCGCCTCG CGCTCCTCCT GAGCAGCGGG CCCAGACTGC 120

GCTCCGGCCG CGGCCCTCGC CCCGCGGAGC CCTCCTACCC CGGCCCGACG CTCGGCCCGC 180 

GACCTGCCCC GAGCCCTCTC CATGGAGGCA GCCCGCCCCT CCGGCTCCTG GAACGGAGCC 240 

CTCTGCCGGC TGCTCCTGCT GACCCTCGCG ATCTTAATAT TTGCCAGTGA TGCCTGCAAA 300 

AATGTGACAT TACATGTTCC CTCCAAACTA GATGCCGAGA AACTTGTTGG TAGAGTTAAC 360

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420

TTGGAGGATG GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGGA GAAGAGAAGT 480

TTTACCATAT TACTTTCCAA CACTGAGAAC CAAGAAAAGA AGAAAATATT TGTCTTTTTG 540

GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCGCGCC 600 

AAGAGAAGAT GGGCTCCAAT TCCTTGTTCG ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660

CTTTTCCTTC AACAGGTTCA ATCTGACACG GCCCAAAACT ATACCATATA CTATTCCATA 720

AGAGGTCCTG GAGTTGACCA AGAACCTCGG AATTTATTTT ATGTGGAGAG AGACACTGGA 780

AACTTGTATT GTACTCGTCC TGTAGATCGT GAGCAGTATG AATCTTTTGA GATAATTGCC 840 

TTTGCAACAA CTCCAGATGG GTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT TACAATTTTT 960 

GAAAATTGCA GAGTGGGCAC TACTGTGGGA CAAGTGTGTG CTACTGACAA AGATGAGCCT 1020 

GACACGATGC ACACACGCCT GAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCACC 1080

CTATTTTCTA TGCATCCAAC TACAGGCGTG ATCACCACAA CATCATCTCA GCTAGACAGA 1140 

GAGTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200

GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTGATG ATGTAAATGA CCACTTGCCA 1260 

ACATTTACTC GTACTTCTTA TGTGACATCA GTGGAAGAAA ATACAGTTGA TGTGGAAATC 1320

TTACGAGTTA CTGTTGAGGA TAAGGACTTA GTGAATACTG CTAACTGGAG AGCTAATTAT 1380

ACCATTTTAA AGGGCAATGA AAATGGCAAT TTTAAAATTG TAACAGATGC CAAAACCAAT 1440

GAAGGAGTTC TTTGTGTAGT TAAGCCTTTG AATTATGAAG AAAAGCAACA GATGATCTTG 1500

CAAATTGGTG TAGTTAATGA AGCTCCATTT TCCAGAGAGG CTAGTCCAAG ATCAGCCATG 1560 

AGCACAGCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGCCCTGA GTGTAACCCT 1620

CCAATACAGA CTGTTCGCAT GAAAGAAAAT GCAGAAGTGG GAACAACAAG CAATGGATAT 1680

AAAGCATATG ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GGGTCACCAT TGATGAAAAT ACAGGATCAA TCAAAGTTTT CAGAAGCCTG 1800

GATAGAGAGG CAGAGACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860 

CAAGGAGGGA GAACATGTAC GGGGACACTG GGCATTATAC TTCAAGACGT GAATGATAAC 1920

AGCCCATTCA TACCTAAAAA GACAGTGATC ATCTGCAAAC CCACCATGTC ATCTGCGGAG 1980

ATTGTTGCGG TTGATCCTGA TGAGCCTATC CATGGCCCAC CCTTTGACTT TAGTCTGGAG 2040

AGTTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTAATGA TACAGCAGCA 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACAGTGAGA 2160 

GATAGACTTG GCATGTCTAG TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220

GAAAATGACT GCACACATCG TGTAGATCCA AGGATTGGCG GTGGAGGAGT ACAACTTGGA 2280

AAGTGGGCCA TCCTTGCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340

CTGGTCTGTG GGGCTTCTGG GACGTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAAGT GTATTCTGCG 2460

AATGGCTTCA CAACCCAAAC TGTGGGCGCT TCTGCTCAGG GAGTTTGTGG CACCGTGGGA 2520

TCAGGAATCA AAAACGGAGG TCAGGAGACC ATCGAAATGG TGAAAGGAGG ACACCAGACC 2580 

TCGGAATCCT GCCGGGGGGC TGGCCACCAT CACACCCTGG ACTCCTGCAG GGGAGGACAC 2640

ACGGAGGTGG ACAACTGCAG ATACACTTAC TCGGAGTGGC ACAGTTTTAC TCAGCCCCGT 2700 

CTTGGTGAAA AAGTGTATCT GTGTAATCAA GATGAAAATC ACAAGCATGC CCAAGACTAT 2760

GTCCTGACAT ATAACTATGA AGGAAGAGGA TCGGTGGCTG GGTCTGTAGG TTGTTGCAGT 2820

GAACGACAAG AAGAAGATGG GCTTGAATTT TTGGATAATT TGGAGCCCAA ATTTAGGACA 2880

CTAGCAGAAG CATGCATGAA GAGATGAGTG TGTTCTAATA AGTCTCTGAA AGCCAGTGGC 2940

TTTATGACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAAGCAGAAG ATGCTATTTG 3000 

TGGGGGTTTT TCTCTCATTA TTTGGATGGA ATCTCTTTGG TCAAATGCAC ATTTACAGAG 3060

AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTATCT 3120 

TCTATCCAAG GAGGTCTACA GAGAAATTAA AGTCTGCCTT ATTTGTTACA TTTGGGTATA 3180 

ATGACAACAG CCAATTTATA GTGCAATAAA ATGTAATTAA TTCAAGTCCT TATTATAGAC 3240

TATTTGAAGC ACAACCTAAT GGAAAATTGT AGAGACCTTG CTTTAACATT ATCTCCAGTT 3300

AATTAAGTGT TCATGTGGTG CTTGGAAACT GTTGTTTTCC TGAACATCTA AAGTGTGTAG 3360 

I TGCTATTATT TTATTCTTGT AATGTGACCT TTTCACTGTG CAAAGGGAGA 342 0 
A GGCATTGACT ATTACAATTT CATT 

Seq ID NO: 617 Protein sequence
KKRHTKEKVL RRAKRRWAPI
EPRNLFYVER DTGNLYCTRP 
PIFTEETYTF TIFENCRVGT 
TGVITTTSSQ LDRELIDKYQ 
VTSVEENTVD VEILRVTVED 
KPLNYEEKQQ MILQIGWNE 
KENAEVGTTS NGYKAYDPET 
KNGIYNITVL ASDQGGRTCT 
EPIHGPPFDF SLESSTSEVQ 
VTSLDVTLCD CITENDCTHR 
TSKQPKVIPD DLAQQNLIVS 
QETIEMVKGG HQTSESCRGA 
CNQDENHKHA 



PCSMLENSLG 
VDREOYESFE 
TVGQVCATDK 
LKIKVQDMDG 



PFPLFLQQVQ SDTAQNYTIY 
I I AFATTPDG YTP3LPLPLI 
DEPDTMHTRL KYSIIGQVPP 
QYFGLQTTST 



RSSSGIRYKK 
GTLGIILQDV 
RMWRLKAIND 



SAMSTATVTV KVEDQDEGPE C 



NTEAPGDDKV 
GRGSVAGSVG 



TVGSGIKNGG 
QPRLGEKVYL 
FRTLAEACMK



Nucleic Acid Accession #: NM_004949.1
Coding sequence: 202..2745



, AGAAAAGCAC 
! CCCAGACTGC 

GCTCCGGCCG CGGCCCTCGC CCCGCGGAGC CCTCCTACCC CGGCCCGACG CTCGGCCCGC 
C GAGCCCTCTC CATGGAGGCA GCCCGCCCCT CCGGCTCCTG GAACGGAGCC 



r GGATGAGAGG CAGGCGCTTC AGAGAAGCTA A 
CTCCGCCTCG 
CCCGCGGAGC 
CATGGAGGCA 



AATGTGACAT 



CTGAAAGAGT 
TTGGAGGATG 
TTTACCATAT 
GAGCATCAAA 
AAGAGAAGAT 
CTTTTCCTTC 



GCTTTACAGC 
GTTCAGTCTA 
TACTTTCCAA 
CAAAGGTCCT 
GGGCTCCAAT 
AACAGGTTCA 
GAGTTGACCA 



GAGTTAATTG ACAAGTACCA 
CAACTTCAAC 



ACATTTACTC 
TTACGAGTTA 
ACCATTTTAA 
GAAGGAGTTC 
CAAATTGGTG 
AGCACAGCAA 
CCAATACAGA 



CTCCAAACTA 
TGCAAATCTA 
TACAACAAAT 
CACTGAGAAC 
AAAGAAAAGA 
TCCTTGTTCG 
ATCTGACACG 
AGAACCTCGG 
TGTAGATCGT 
GTATACTCCA 
CCCAATTTTT 
TACTGTGGGA 
GAAGTACTCC ATCATTGGGC 
TACAGGCGTG ATCACCACAA 
GTTGAAAATA AAAGTACAAG 
TTGTATCATT AACATTGATG 
TGTGACATCA GTGGAAGAAA 
TAAGGACTTA GTGAATACTG 
AAATGGCAAT TTTAAAATTG 
TAAGCC 



AACTTGTTGG 
GTGATCCTGA 
TGTCCTCGGA 



TAGAGTTAAC . 
CTTCCAAATT 
GAAGAGAAGT 



CATACTAAAG 
ATGCTAGAAA 
GCCCAAAACT 
AATTTATTTT 
GAGCAGTATG 
GAACTTCCAC 



AAAAAGTTCT 
ACTCCTTGGG 
ATACCATATA 
ATGTGGAGAG 
AAT CTTTTGA 



AAGGCGCGCC 
TCCTTTTCCA 
CTATTCCATA 



CTTATACTTT 
CTACTGACAA 
AGGTGCCACC 
CATCATCTCA 



CAGTTACTGT T 



ACCCAGAAAC 
GGGTCACCAT 
CAGAGACCAT 
GAACATGTAC 



AGTGGCATAA 



CGTCTTTCCT 
GATAGACTTG 
- GAAAATGACT 
AAGTGGGCCA 
CTGGTCTGTG 
CAGCAGAACC 
AATGGCTTCA 
TCAGGAATCA 
TCGGAATCCT 



ATCAGAATGA 
GCATGTCTAG 
GCACACATCG 
TCCTTGCAAT 
GGGCTTCTGG 
TAATTGTATC 
CAACCCAAAC 



CAAAAATGGC 
GGGGACACTG 
GACAGTGATC 
TGAGCCTATC 
GAGAATGTGG 



CATGGCCCAC 



ATGTAAATGA 
ATACAGTTGA 
CTAACTGGAG 
TAACAGATGC 
AAAAGCAACA 
CTAGTCCAAG 
AGGGCCCTGA 
GAACAACAAG 
GGTATAAGAA 
TCAAAGTTTT 
TTACAGTCCT 
TTCAAGACGT 
CCACCATGTC 
CCTTTGACTT 
CAATTAATGA 



GATAATTGCC 
AAT CAAAAT A 
TACAATTTTT 
AGATGAGCCT 
ATCACCCACC 
GCTAGACAGA 
TCAGTATTTT 



ATCAGCCATG 



TGCATCAGAC 
GAATGATAAC 
ATCTGCGGAG 
TAGTCTGGAG 
TACAGCAGCA ! 



CTTGGTGAAG 
GTATCTGTGT 
CTATGAAGGA 
AGATGGGCTT 
CATGAAGAGA 
AAAAAATTAC 



ACAACTGCAG 
AATCCATTAG 
AATCAAGATG 



GACGTCTAAA 
AAACACAGAA 
TGTGGGCGCT 
TCAGGAGACC 
TGGCCACCAT 
ATACACTTAC 
AGGACACACT 



CACTGTGTGA 
GTGGAGGAGT 
TCTTTTGCAT 
TAATTCCTGA 



TCTGCTCAGG 
ATCGAAATGG 
CACACCCTGG 



TGAGTGTGTT 
AAACCAAGAA 
GATGGAATCT 



TGGCTGGGTC 
ATAATTTGGA 
CTAATAAGTC 



CTTTGGTCAA 
TTTACATATT 
AATTAAAGTC TGCCTTATTT 
AATTAATTCA 
ACCTTGCTTT 
TTTTCCTGAA 
TGACCTTTTC 



CTGATTAAAA 
GCATGCCCAA 
TGTAGGTTGT 
GCCCAAATTT 
TCTGAAAGCC 
CAGAAGATGC 
ATGCACATTT 
TTTAAATTAC 
GTTACATTTG 



GAGTTTGTGG 
TGAAAGGAGG 
ACTCCTGCAG 
ACAGTTTTAC 



CTGCATTACC 
ACAACTTGGA : 
CCTGTTTACG 
TGATTTAGCC 
GTATTCTGCG : 
CACCGTGGGA 
ACACCAGACC 
GGGAGGACAC 



TTGACTATTA CAATTTCATT 



AACATTATCT 
CAT CT AAAGT 
ACTGTGCAAA 



GAAAGAAAGT 
GACTATGTCC TGACATATAA 
TGCAGTGAAC GACAAGAAGA 
AGGACACTAG CAGAAGCATG 
AGTGGCTTTA TGACTTTTAA 
TATTTGTGGG GGTTTTTCTC 
ACAGAGAGAC ACTATAAACA 
TTATCTTCTA 
GGTATAATGA 
ATAGACTATT 
CCAGTTAATT 
GTGTAGACTG 
GGGAGATTTC 



CAACAGCCAA 



MEAARPSGSW NGALCRLLLL TLAILIFASD ACKNVTLHVP SKLDAEKLVG
ANLIHSSDPD FQILEDGSVY TTNTILLSSE KRSFTILLSN TENQEKKKIF VFLEHQTKVL 120

KKRHTKEKVL RRAKRRWAPI PCSMLENSLG PFPLFLQQVQ SDTAQNYTIY YSIRGPGVDQ 180 

EPRNLFYVER DTGNLYCTRP VDREQYESFE IIAFATTPDG YTPELPLPLI IKIEDENDNY 240

PIFTEETYTF TIFENCRVGT TVGQVCATDK DEPDTMHTRL KYSIIGQVPP SPTLFSMHPT 300 

TGVITTTSSQ LDRELIDKYQ LKIKVQDMDG QYFGLQTTST CIINIDDVND HLPTFTRTSY 360 

VTSVEENTVD VEILRVTVED KDLVNTANWR ANYTILKGNE NGNFKIVTDA KTMEGVLCVV 420

KPLNYEEKQQ MILQIGVVNE APFSREASPR SAMSTATVTV NVEDQDEGPE CNPPIQTVRM 480

KENAEVGTTS NGYKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLDREAETI 540 

KNGIYNITVL ASDQGGRTCT GTLGIILQDV NDNSPFIPKK TVIICKPTMS SAEIVAVDPD 600 

EPIHGPPFDF SLESSTSEVQ RMWRLKAIND TAARLSYQND PPFGSYVVPI TVRDRLGMSS 660

VTSLDVTLCD CITENDCTHR VDPRIGGGGV QLGKWAILAI LLGIALLFCI LFTLVCGASG 720

TSKQPKVIPD DLAQQNLIVS NTEAPGDDKV YSANGFTTQT VGASAQGVCG TVGSGIKNGG 780

QETIEMVKGG HQTSESCRGA GHHHTLDSCR GGHTEVDNCR YTYSEWHSFT QPRLGEESIR 840 
GHTLIKN 

Nucleic Acid Accession #: NM_032545.1
Coding sequence: 46..718

1 11 21 31 41 51

I I I I I I 

AAACTGATCT TCAATGCACT AAGAGAAGGA GACTCTCAAA CCAAAAATGA CCTGGAGGCA 60 

CCATGTCAGG CTTCTGTTTA CGGTCAGTTT GGCATTACAG ATCATCAATT TGGGAAACAG 120

CTATCAAAGA GAGAAACATA ACGGCGGTAG AGAGGAAGTC ACCAAGGTTG CCACTCAGAA 180 

GCACCGACAG TCACCGCTCA ACTGGACCTC CAGTCATTTC GGAGAGGTGA CTGGGAGCGC 240

CGAGGGCTGG GGGCCGGAGG AGCCGCTCCC CTACTCCCGG GCTTTCGGAG AGGGTGCGTC 300

CGCGCGGCCG CGCTGCTGCA GGAACGGCGG TACCTGCGTG CTGGGCAGCT TCTGCGTGTG 360 

CCCGGCCCAC TTCACCGGCC GCTACTGCGA GCATGACCAG AGGCGCAGTG AATGCGGCGC 420

CCTGGAGCAC GGAGCCTGGA CCCTCCGCGC CTGCCACCTC TGCAGGTGCA TCTTCGGGGC 480

CCTGCACTGC CTCCCCCTCC AGACGCCTGA CCGCTGTGAC CCGAAAGACT TCCTGGCCTC 540

CCACGCTCAC GGGCCGAGCG CCGGGGGCGC GCCCAGCCTG CTACTCTTGC TGCCCTGCGC 600 

ACTCCTGCAC CGCCTCCTGC GCCCGGATGC GCCCGCGCAC CCTCGGTCCC TGGTCCCTTC 660

CGTCCTCCAG CGGGAGCGGC GCCCCTGCGG AAGGCCGGGA CTTGGGCATC GCCTTTAATT 720

TTCTATGTTG TAAATAATAG ATGTGTTTAG TTTACCGTAA GCTGAAGCAC TGGGTGAATA 780

TTTTTATTGG GTAATAAATA TTTTCATGAA AGCGCCAAAA AAAAAAAAAA AAAAAAAAAA 840 



MTWRHHVRLL FTVSLALQII NLGNSYQREK KNGGHEEVTK VATQKHRQSP LNWTSSHFGE 

VTGSAEGWGP EEPLPYSRAF GEGASARPRC CRNGGTCVLG SFCVCPAHFT GRYCEHDQRR 

SECGALEHGA WTLRACHLCR CIFGALHCLP LQTPDRCDPK DFLASHAHGP SAGGAPSLLL 

LLPCALLHRL LRPDAPAHPR SLVPSVLQRE RRPCGRPGLG HRL

Seq ID NO: 622 DNA sequence

Nucleic Acid Accession #: FGENESH predicted 

Coding sequence: 1..390 

1 11 21 31 41 51 

I I I I I I 

ATGAGGTTCA GTGTCTCAGG CATGAGGACC GACTACCCCA GGAGTGTGCT GGCTCCTGCT 
TATGTGTCAG TCTGTCTCCT CCTCTTGTGT CCAAGGGAAG TCATCGCTCC CGCTGGCTCA 
GAACCATGGC TGTGCCAGCC GGCACCCAGG TGTGGAGACA AGATCTACAA CCCCTTGGAG 
CAGTGCTGTT ACAATGACGC CATCGTGTCC CTGAGCGAGA CCCGCCAATG TGGTCCCCCC 
TGCACCTTCT GGCCCTGCTT TGAGCTCTGC TGTCTTGATT CCTTTGGCCT CACAAACGAT 
TTTGTTGTGA AGCTGAAGGT TCAGGGTGTG AATTCCCAGT GCCACTCATC TCCCATCTCC 
AGTAAATGTG AAAGAGGCCG GATATGTTAG 

Seq ID NO: 623 Protein sequence



MRFSVSGMRT DYPRSVLAPA YVSVCLLLLC PREVIAPAGS EPWLCQPAPR CGDKIYNPLE
QCCYNDAIVS LSETRQCGPP CTFWPCFELC CLDSFGLTND FVVKLKVQGV NSQCHSSPIS
SKCERGRIC 

Seq ID NO: 624 DNA sequence 
Nucleic Acid Accession #: M18728.1 
Coding sequence: 51..1085

1 11 21 31 41 51 

I I I I I I 

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 
CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 
TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 
ATGTCGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG
GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 
TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 
ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 
TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCGGACA GTTCCATGTA TACCCGGAGC 
TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG
CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 
GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA
GCGTCAAAAG GAACGATGCA
ACCGCAGTGA CCCAGTCACC 
CCTCAAAGGC CAATTACCGT 
ACCCACCTGC ACAGTACTCT 
TCTTTATCCC CAACATCACT 
CAGCCACTGG CCTCAATAGG 
TCCTCTCAGC TGTGGCCACC 
TATAGCAGCC CTGGTGTATT 
GAATTCTTCT AGCTCCTCCA 
CTCTGCTCCT GAAGCCCTAT 
ACCCTCAGGC CTGAGGTGTG 
3 GTGAGAAATT 



GGATCCTATG 
CTGAATGTCC 
CCAGGGGAAA 



AATGTGAAAT 
TCTATGGCCC 
ATCTGAACCT 



GTGAATAATA GCGGATCCTA 



GTCGGCATCA 
TTCGATATTT 
ATCCCATTTT 
ATGCTGGAGA 
TGCCACTCAG 
GACGACTTCA 



TGCCTCTTTC GCTTGGCAGG 
GGGTAACTTA ACAGAGTGTC AGATCTATCT 



CAGGAAGACT 
ATCCCATGGA 
TGGACAACTC 
AGACTTCACC 
CACTATGGAC 
ACCCCCTTTT 
CATTAGTATT 
TGTCAATCCC 



GCGAGTGCCA 
ACCATTTCCC 
CTCCTGCCAC GCAGCCTCTA 
CCAGCAATCC ACACAAGAGC 
TATGTGCCAA GCCCATAACT 
AGTCTCTGGA AGTGCTCCTG 
GCTGGCCAGG GTGGCTCTGA 
GGCAGATTGG ACCAGACCCT

ACAAGGTCTG 
TAACTAGAGA CAGTCAAACT
AAGATGTCAA 
TGCTTATGCC 
TCACAAGAAG TAGCTTCAGA
ATAAAATAAG 



AAATGTACAG 



GGAGGAGTCT 
GCTGAGACTA 
CTGACTCATT 
CTCTTGGTAT 
CTCTAAAAGC 
GGCTGGAATT 
ATAAAAGCCC 
TCTCACCTAG 
GTTAAGGAAG 




TTTACAAAAA 
TGTTTCCTTG 
CTATCACTGT 
TAGCTCTATA 



CAAATGGTGG 
GTGAGCGCAT 
AAGATAGATC 
TTCCAGTCTA 



GAGAAATGTG 



TGAGCCAGTG 
CAATTAAAAA 
CTTGAGTTAG 
ACTAATCTGA 
ACAAAACCCA 
TGGTGCTGCT 



CAGCCATCAA 
TCATCAGGAG 
TAGCACTAAT 



CATAATACAG 



AACATCATAA 
GCTTTAAGAT 
CTACATACTC 
CAATTTAAAA 
AAGTCCCCTC 
TGTATTTATT 
ATTGTATTGC 
AATCACAAAT 



TTGGTCACAC 



AAAAAAAAGA 
TACTTTAACT 
TCTGTGGTTC 



AAAAGCCAAT 



I 



21 



S TPFNVAB 



51 



MGPPSAPPCR LHVPWKEVLL TASLLTPWNP PTTAKLTIES TPFNVAEGKE VLLLAHNLPQ 
NRIGYSWYKG ERVDGNSLIV GYVIGTQQAT PGPAYSGRET IYPNASLLIQ NVTQNDTGFY 
TLQVIKSDLV NEEATGQFHV YPELPKPSIS SKMSNPVSDK DAVAFTCEP3 VQNTTYLWNV 
NGQSLPVSPR LQLSNGNMTL TLLSVKRNDA GSYECEIQNP ASANRSDPVT LNVLYGPDVP 
TISPSKANYR PGENLNLSCH AASNPPAQYS WFINGTFQQS TQELFIPNIT VNNSGSYMCQ 
AHNSATGLNR TTVTMITVSG SAPVLSAVAT VGITIGVLAR VALI 

Seq ID NO: 626 DNA sequence 



GGAGCTCAAG CTCCTCTACA 



TTCTAACCTT 
ATGTCGCAGA 
GTTACAGCTG



CTGGAACCCA 
GGGGAAGGAG 
GTACAAAGGC 
ACAAGCTACC 



TTGCATGTCC 



CAGAGAAGAC AGCAGAGACC ATGGGACCCC 
CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 
ACGCCATTCA 



TCATAAAGTC AGATCTTGTG 
C CTCCATCTCC 
G TGAACCTGAG 
T CAGTCCCAGG 
G GAACGATGCA 
A CCCAGTCACC 
: CAATTACCGT 



AACGTCACCC 



CAACCGGACA G 



CCTCAAAGGC C 



TCTTTATCCC 
CAGCCACTGG 
TCCTCTCAGC 
TATAGCAGCC 
GAATTCTTCT 
CTCTGCTCCT 
ACCCTCAGGC 
GCAAACCATG 
AACAAGACTC 
TGCCTCTTTC 
GGGTAACTTA 
AGATCCTTTA 
AAATGTACAG 
TTTAATTCAA 
GGAGGAGTCT 



CAACATCACT 
CCTCAATAGG 
TGTGGCCACC 



CTGAATGTCC 
CCAGGGGAAA 
TGGTTTATCA 



ACCACAGTCA 
GTCGGCATCA 
TTCGATATTT 
ATCCCATTTT 
ATGCTGGAGA 



GTGAGAAATT G 



GCTTGGCAGG 
ACAGAGTGTC 
GTGCACCCAG 
TGGTCCTTTT 
CCCAGCCATG 
GTGCAGTTTC 



AGATCTATCT 



CAATGCCAAA 
TGACACTTGT 
ATTAACAAAT 



CAACCTACCT 
CCAATGGCAA 
AATGTGAAAT 
TCTATGGCCC 
ATCTGAACCT 
ATGGGACGTT 
GCGGATCCTA 
CGATGATCAC 
CGATTGGAGT 
CAGGAAGACT 
ATCCCATGGA 
TGGACAACTC 
AGACTTCACC 
CACTATGGAC 
ACCCCCTTTT 
CATTAGTATT 
TGTCAATCCC 
TAGCAGCATC 
CTTCTAGACT 



TACCCGGAGC 
GATGCTGTGG 
AATGGTCAGA 
ACTCTACTCA 
GCGAGTGCCA 
ACCATTTCCC 
GCAGCCTCTA 



AGTCTCTGGA 
GCTGGCCAGG 
GGCAGATTGG 



AATGAAAATT 



\ ATAGTCATAC 



TGTTGAACAT 
GTGCTGCTTG 

TAGTAGTCAT 
CAGCCATCAA 
TCATCAGGAG 
TAGCACTAAT 



AACGTTTTAC 
TTTAACACAG 
CACCTGTTCT 
GCTCCCTACC 
GGCTAAATAC 
GTTAAAATGG 



GCCCATAACT 
AGTGCTCCTG 
GTGGCTCTGA 
ACCAGACCCT 
ACAAGGTCTG 
TAAAGGGAAA 
CAGTCAAACT 
AAGATGTCAA 
TGCTTATGCC 
TAGCTTCAGA 
ATAAAATAAG 



CACTCCCTGT 
AGCTGAACAG 
AATGGGTATC 



ACTCCCTGGT GTAGTGTATT : 
ATAGTGAATG GTCTCTCTTT : 
AACATCATAA CCCATGAAGG 
GCTTTAAGAT TTGGTCACAC 




TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520
TAGCTCTATA ACT 






MDSFSQDVKT RLLIMIRLLP 



SPAWQDDAVI SISQEVASEG NLTEOQIYLV 



SIFNTAVCSN VQWSPSBLDP 



I 



I 



GGAGCTCAAG CTCCTCTACA 
CCTCAGCCCC TCCCTGCAGA 
TTCTAACCTT CTGGAACCCA 
ATGTCGCAGA GGGGAAGGAG 
GTTACAGCTG GTACAAAGGC 
TAGGAACTCA ACAAGCTACC 



AAGAGGTGGA 
TTGCATGTCC 
CCCACCACTG 
GTTCTTCTAC 
GAAAGAGTGG 



TCATAAAGTC 
TGCCCAAGCC 
CCTTCACCTG 



GCGTCAAAAG 



CCTCAAAGGC 
ACCCACCTGC 
TCTTTATCCC 
CAGCCACTGG 
TCCTCTCAGC 
TATAGCAGCC 
GAATTCTTCT 
CTCTGCTCCT 



AGATCTTGTG 
CTCCATCTCC 
TGAACCTGAG 
CAGTCCCAGG 
GAACGATGCA 
CCCAGTCACC 
CAATTACCGT 
ACAGTACTCT 
CAACATCACT 
CCTCAATAGG 
TGTGGCCACC 



AACGTCACCC 
AATGAAGAAG 
AGCAACAACT 
GTTCAGAACA 



GGATCCTATG 
CTGAATGTCC 
CCAGGGGAAA 



GAAGCCCTAT 



GTGAATAATA 
ACCACAGTCA 
GTCGGCATCA 
TTCGATATTT 
ATCCCATTTT 
ATGCTGGAGA 



CCTGGAAGGA 
CCAAGCTCAC 
TCGCCCACAA 
ATGGCAACAG 
CATACAGTGG 
AGAATGACAC 
CAACCGGACA 
CCAACCCCGT 
CAACCTACCT 
CCAATGGCAA 
AATGTGAAAT 
TCTATGGCCC 
ATCTGAACCT 
ATGGGACGTT 
GCGGATCCTA 



AGCAGAGACC 
GGTCCTGCTC 
TATTGAATCC 
CCTGCCCCAG 
TCTAATTGTA 
TCGAGAGACA 
AGGATTCTAT 



ACAGCCTCAC 



GGAGGACAAG 



GGATATGTAA 
ATATACCCCA 
ACCCTACAAG 
TACCCGGAGC 
GATGCTGTGG 
AATGGTCAGA 
ACTCTACTCA 



A ACAGAGTGTC 
AGATCCTTTA GTGCACCCAG 
AAATGTACAG TGGTCCTTTT 
TTTAATTCAA CCCAGCCATG 
GGAGGAGTCT GTGCAGTTTC 
GCTGAGACTA 



TACCCTCCTA 



CTCTTGGTAT 
CTCTAAAAGC 
GGCTGGAATT 
ATAAAAGCCC 
TCTCACCTAG 
GTTAAGGAAG 
ACACAGGAGA 
TTTACAAAAA 



GACGACTTCA 
TAAGGCTCTT 
ATGATGCTGT 
AGATCTATCT
TGACTGACAT 
CAGAGTTGGA 
CAATGCCAAA 
TGACACTTGT 
ATTAACAAAT 
TTTTAGTTGG 
ATAGTCATAC 
TGCATGCAGC 



CGATTGGAGT 
CAGGAAGACT 
ATCCCATGGA 
TGGACAACTC 
AGACTTCACC 
CACTATGGAC 
ACCCCCTTTT 
CATTAGTATT 
TGTCAATCCC 
TAGCAGCATC 
CTTCTAGACT 
TAATAGAATT 
TGTTGAACAT 
GTGCTGCTTG 
TTTGTATCTT 
TAGTAGTCAT 



AGATGTCCCC 
CTCCTGCCAC 
CCAGCAATCC 
TATGTGCCAA 
AGTCTCTGGA 
GCTGGCCAGG 
GGCAGATTGG 



A GCGAGTGCCA 720 



AATGAAAATT 
TAACTAGAGA 
AGCTTTTCCC 
AATTTGTCCT 
TCACAAGAAG 



ACCATTTCCC 
GCAGCCTCTA 
ACACAAGAGC 
GCCCATAACT 
AGTGCTCCTG 
GTGGCTCTGA 
ACCAGACCCT 
ACAAGGTCTG 



CACCTGTTCT 



TGCTTATGCC 
TAGCTTCAGA 
ATAAAATAAG 
CCGTGTGTTC 
CACTCCCTGT 



GGCTAAATAC 
GTTAAAATGG 
GCCTAAGGTG 
ACTCCCTGGT 
ATAGTGAATG 



CTACACTCAT 



CAAATGGTGG 



CTATCACTGT 



AAGATAGATC 
TTCCAGTCTA 
AGTAACCTGA 
TTCCAATTTG 
ACTTGTAGAG 



TAGCACTAAT GCTTTAAGAT 
TGAGCCAGTG GTGCTAAATG CTACATACTC 
CAATTAAAAA AAATTAAAAC CAATTTAAAA 
CTTGAGTTAG CATAATACAG AAGTCCCCTC 
ACTAATCTGA TGTTAACCAA TGTATTTATT 
ACAAAACCCA CTGTTCTTGT ATTGTATTGC 
TGGTGCTGCT TTAATTCATA AATCACAAAT 



GTAGTGTATT 
GTCTCTCTTT 
CCCATGAAGG 
TTGGTCACAC 
CAACTGAAAT 
AAAAAAAAGA 



TCTGTGGTTC 



AAAAGCCAAT 



MLTNVFISW LFPCSNLTKP TVLVLYCPGG AITVLVEWCC FNS 



0 DNA sequence 



Coding sequenc 



I 



I 



I 



I 



GCGGCGGGCG CAGACAGCGG CGGGCGCAGG ACGTGCACTA TGGCTCGGGG CTCGCTGCGC 

CGGTTGCTGC GGCTCCTCGT GCTGGGGCTC TGGCTGGCGT TGCTGCGCTC CGTGGCCGGG 

GAGCAAGCGC CAGGCACCGC CCCCTGCTCC C3CGGCAI I' CCTGi ACCTGGAi 

AAGTGCATGG ACTGCGCGTC TTGCAGGGCG CGACCGCACA GCGACTTCTG CCTGGGCTGC 

GCTGCAGCAC CTCCTGCCCC CTTCCGGCTG CTTTGGCCCA TCCTTGGGGG CGCTCTGAGC 

CTGACCTTCG TGCTGGGGCT GCTTTCTGGC TTTTTGGTCT GC 

GAGAAGTTCA CCACCCCCAT AGAGGAGACC GGCGGAGAGG GCTGCCCAC-C TGTGGCGCTG 



CCGCAGGAGA 3 SO 




ATCCAGTGAC AATGTGCCCC CTGCCAGCCG GGGCTCGCCC ACTCATCATT CATTCATCCA 
TTCTAGAGCC AGTCTCTGCC TCCCAGACGC GGCGGGAGCC AAGCTCCTCC AACCACAAGG 
GGGGTGGGGG GCGGTGAATC ACCTCTGAGG CCTGGGCCCA GGGTTCAGGG GAACCTTCCA 
AGGTGTCTGG TTGCCCTGCC TCTGGCTCCA GAACAGAAAG GGAGCCTCAC GCTGGCTCAC 
ACAAAACAGC TGACACTGAC TAAGGAACTG CAGCATTTGC ACAGGGGAGG GGGGTGCCCT 
CCTTCCTTAG GACCTGGGGG CCAGGCTGAC TTGC-GGGGCA GACTTGACAC TAGGCCCCAC 
TCACTCAGAT GTCCTGAAAT TCCACCACGG GGGTCACCCT GGGGGGTTAG GGACCTATTT 
TTAACACTAG GGGCTGGCCC ACTAGGAGGG CTGGCCCTAA GATACAGACC CCCCCAACTC 
CCCAAAGCGG GGAGGAGATA TTTATTTTGG GGAGAGTTTG GAGGGGAGGG AGAATTTATT 
AATAAAAGAA T 
MARGSLRRLL RLLVLGLWLA LLRSVAGEQA PGTAPC3RGS SKSADLDKCM DCASCRARPH
SDFCLGCAAA PPAPFRLLWP ILGGALSLTF VLGLLSGFLV WRRCRRREKF TTPIEETGGE 
GCPAVAL IQ 



CAACAGACCT 



AAAGAGCATA 



CATTATCGGG 
GGACTCAGAG 
AGCTCTCATT 
TGTGGAGTTT 
CCCAGCATGA 



CACATCTTTC 
CCCCTAGGCC 
TTATTCACTT 
ACAAGGAAGG 
GCTATGTGGA 



TTCTTATGAA 
CTATTCAAAA 
GGAAAGGAAC 



GGGAGTTCAT A 



GTGAGAGAAG 



TTGTCGTAGA 



G GTGATGTGCT 
CGTCGGAGAC ATGACAGTGC 
ATGGCATTTG TGGGAACAGT 
CAAATCACTG TGGAGACATT 
ATGAATCACG ATGATGGGAG 
GGAGCATCGG GTTCCAGAAA 



CATTTATCGA 
TATAGAGAAA 
TCGAAGAAGA 
CAAGGAAAGG 
CCTGGCAAAC 
ACTGGAGATT 
GGGGAACTTC 
ACAGCTAGTT 
GTGTTCAAGG 
TGCTTCCATT 



CCCTCCTGTG 
GAATGTGAAT 
TGTGCATATG GTGACTGTTG 
GGAAAAACCA GTGAGTGTGA 
CCAGATGTTT TTATTCAGAA
GGCATGTGCC AGTATTATGA 
GCCCCCAAAG ATTGTTTCAT 
TTCTCTGGCA ATGAATACAA 



CTTTAGCAGT 
CCTTCTTAAT 
GGTGGACGCT 



TAAAGACTGT 
TGTTCCAGAG 
TGGATATCCT 
TGCTCAATGT 
TGAAGTGAAT 
GAAGTGTGCC 



GAATCATTGT 
CGACTGTTTT 
CCTGCAGAAC 
GCCTCTGAAA 
GGAAGAGCCT 
CCGGTATGTG 
TCAGACTGCT 
TATGTTAAAT 
CAACATAGTT 
TCTTATCACA 
AACTGCAGGA 
TGTGTTTGGA 



TGTGGAGCAA AGAGCTGCAT CATGAATTCA 
TGCAGTGCAG A 
ATICCAAAGC C 
GGGGAAGAGT GTGACTGTGG T 
AGTACCTGTA AGCTTAAATC ATTTGCTGAG 



ATGGATGATG 
GAAACTGCAA 

AGAGCTGTCT 
TATGACATGA 
TACTTGGATA 
TGGACCAATG 
GTGCAGTGGC 



C CCAATATACA 
A TTGCTCTTAG 
GGATTGAACC 
TCTACAAAGA 
AGGATGAAGA 
TGCCACAGAC 
TGGGAAGAAA 
GTATGTATAT 
GAAACCTGAT 
GGGAAAAGTT 



GCGGGATTAA 



TTCTTCTCA 



TTGCTACAAC 



TCTAAAGGTG A 



AGTCGAGGCA CCAAATGTTG 
GGGATGGTTA ACGAAGGCAC 
GTAGATGCTT CTGTTCTGAA 
GTATGTAATA GCAATAAGAA 
ACTAAAGGAT ACGGAGGAAG 
TTGAGGGACG GACTTCTGGT 
TTTATCTTCA TCAAGAGGGA 
ACATATGAGT CAGATGGCAA



GTACCAACCT ATGCAGCCAA 
CCGAAAGTAT CATCTCAGGG 
TATAGTTCCC TCACTTGATT 
CTAATACTTT TTTTTTTTCT 
GAAAACAAAA CACCACAAAA 



CATCATTGAA 
TGAACATGTT ATTGCAGTGA 
AGTGTTTAAG TGTTATTCTG
AACATGTGAT AATCTAATAC 
TTTTTCATCA TGCACGAATT 
CATGAATAAG CAAATATTGT 
A AGTACAAAAT 



TTGTCACTGT 
TGTGGACAGT 
CTTCTTCTTC 
TCAACTGTGG 
AAATCAAGCA 
ACCTCCCAGA 
GCAACCTCAG 
AAACTTAATT 
TTTTTAACCT 
TGATGTTTTC 
CAGACTTCAC 
TGCAGTAAAG 
TCAGTCATCG 
TTCTCAAATT 
AATTTTCTAC 
CTGTGAAAAC 
AATAATCATC 
CTTCAAAAGA 
ATACTAAAAG 



TTCCAGCTAG 
GCTGGAAAGA 
GATGTTCAGA 



GATCAGATGT 



CAATTGTGGT 
AAAGCTTCAG 
TCAAACGCCT 



GGACCTACAT 



TTGAAAAGCC 
TAACACAGAA 
CCAGGGAATT 



AAAAGTGTCA 
GGGCTCCCCC 
ACAATGAAAT 
CCCTTATTGT 
TCAGAAAGAA 
GACAGCCGGG 
TATATGCAAA 
CAAGGCCACC 
CTGCTCCTGC 
AATGTCTTCA 
TTTCTGTTGC 



CTTCCAGTGT 
TGGACATGGG 
AAATTGTGAG 
GAATACTGCA : 



GAGATCACAA 
GAGTGTTCCT 
CAGATTTGCA 



TACAATAACA 
TGCACTAATC 
GTGTAAGATI 
ATTAATGTAG 
GCTGCCAATA 



ACCTCCTTTA 
GGGAACTGAG 
AACTATGAAT 
TGAGTGTGAG 
TTTCCGTTTC 
ATGGATTTTT 
TTTGTCATTA 
TTCCTCATTG 
ATATCTAATA 
TCACTCACTA 
AGATGTCATA 
GTTACTCGCT 
TTTAATATTA 
CTATTTTAAA 



GGCTATAATA AAGCAGGAGC AATTATAAAA 
CTTGAGAATT TCATGAGCAC TTTAAAATCT GAACTTTCAA AGCTTGCTAT TAAATCATTT 
AGAATGTTTA CATTTACTAA GGTGTGCTGG GTCATGTAAA ATATTAGACA CTAATATTTT 
CATAGAAATT AGGCTGGAGA AAGAAGGAAG AAATGGTTTT CTTAAATACC TACAAAAAAG 
TTACTGTGGT ATCTATGAGT TATCATCTTA GCTGTGTTAA AAATGAATTT TTACTATGGC 



A ACCACAATTA 



GAATTTCTAT TATGAATCAT GTGAAAGCAT G 



A GAGAAATTAA 



2220 
2340 



3540 
3SS0 




AGATATGGTA TGGATCGTAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCATAAT 3720

r AATAGGTTTA TTAACTGAAT TTCATTAGTT TTTTAAAAGT GTTTTTGGTT 3780 

VTATACA AATACAACAT TTACAATAAA TAAAATACTT GAAATTCTCA 3840 
A AAAAAAAAAA AAAAA 



Seg I 






MGSGARFPSG TLRVRWLLLL 
PYSKQVSYVI QAEGKEHIIH 
EGVHNSSIAL SDCPGLRGLL 
3 EEEPPSMTQL 



RPGFQQTSHL SSYEIITPWR 



QNHCHYRGYV 



DVPEYCNGSS 
IEVNSKGDRP 
WGVDFQLGSD 



GTAGMAFVGT 
IMNSGASGSR 
GTPKECELDP 
QFCQPDVFIQ 
GNCGFSGNEY 
VPDPGMVNEG 
PNCETKGYGG 
KRSQTYESDG 
PPPQPKVSSQ 



NPSSCSAEDF 
CCEGSTCKLK 
NGYPCONNKA 
KKCATGNALC 
TKCGAGKICR 
SVDSGPTYNE 
KNQANPSRQP 
GNLIPARPAP 



DKERYDMMGR NQTAVREEMI 
INIVGGAGDV LGNFVQWREK FLITRRRHDS 
NVFGQITVET FASIVAKELG HNLGMNHDDG 
CLLNIPKPDE AYSAPSCGNK 
CKDCRFLPGG T 
YCYNGMCQYY DAQCQVIF 
GKLQCENVQE IPVFGIVPAI IQTPSRGTKC 
NFQCVDASVL NYDCDVQKKC HGHGVCKSNK 
MNTALRDGLL VFFFLIVPLI VCAIFIFIKR 
GSVPRHVSPV TPPREVPIYA NRFAVPTYAA 
APPLYSSLT 



Seq ID NO: 634 DNA sequence 

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..503

1 11 21 31 



AGTCTCTGCT CTTCCCAGCC 
CGGCAGTGAG CTCCCGCTGG 
AGCGGTCCCG CTGCCTGCGG 
CCACTGGGCG GTGGGGCACT 
3 AGCCTGAAGC 



TAATGGGGAA 



GGCCTTGGGC 
AGGTTCAAAA GGCAAAGTTG 
CCCCCAGCTG AACCAGCAAT 
TAAGAGACTG AGTTCTGCAA 
AAATATTTGA CTATTCTGTA 
CTTCTGGTTT AAACTTGTTT 
TTTTTATATC TAGGCTACCT 
TAAAAGCTTA AACACAT 



AAGCAAAGGA 
CTTCGTGGGA 
GTAGACTCTC 
GATAATGATG 
GCATCAGTTC 
TCTTTCATCC 
GCTGTGAACA 
GTTGGTTAGA 



I 

: GCTCCAAGGG CTTCCCGTCG GGACCATGCG 
: GCTGGTCCTC 
CGTGCTGACC 
AAAGAGCACA 
AGAGTACATC 
GAACAGAAAC 
TTCAGAGGAT 
TGCTCCAGGT 
GCCTCTCTCA 



TTGACTAAAT 
ATTGTCGAAA 

TTCAAGGCCC 



TCTCAACGTG 
AAAGAGAAAA 
CAACAAGATT 
TCGTGATTTT 
AGAGTCTTCC 
CGAGCTGTTA 



CTTCTGTTTC 
AAGCTGCAAG 
CTCAACCCM 
TCAAAGATGT 



ACAAAACCCC 
TCCTTGTGCA 
CAAGCAGCAT 
AATTAATGCT 
CCATTCACAA



Seq I 



635 P 



I 

" GRAVPLPAGG 
LSAPGSQREG RNPQLNQQ 

Nucleic Acid Accession #: NM_016
Coding sequence: 265..1299



CTGGCAAAAG



CCGCACCCCA 
TCGGGGAAGT 
TGCCTCGTGG 
AGCGGAGATG 
GCCACCCTCA 
ACCATCCTCT 
AACACCCAAA 



CGAGGAGGGA GCCCCCTTTG 
CCGAGGCTGG ATTTGGGGGA 
CCCGCGCCTC CCGGTCGCCG 
CCCACTTCCT GTGCTCGCCC 



TCGTGTCTCT 
CCACCTTCCC 
GGTGCACTAT 
ATGCTGGGAA 
CGCAGTACAG 
CGGTGCAGAC 
CCAAAATTGT 
CCTGCATAGC 



CAGGCTGCTG 
CAAAGCTATG 
TGACAACCGG 
TGACAAGTGG 
CATCGAGATC 



CCACAGGAGT 
CGGTCCGGCA 
TGGCCTGGCT 



TTTTCCGAGG 
GTCTGCGCGC 
CGCGCTCGCT 
CTGCCGGAGT 
GCCCTGGAAG 



AGAGATTTCT TCAGATATCT 
AACTGGTAGA CCAGAGCCTA CGGTTACTTG 
TGTGAGTGAA GACGAATACT 



GACACTGCAG



CCATACATTT C 



CAGGGGACTA 
AGGTCACCGT 
GACAAAAGGG 
ACAAGGATGA 
TCCTCTCAAA 
TGGCCTCCAA 
TCAGCGAGGT 



TGCCGCCACC ACCACCACCA ACACAACAGC 
GAAACACAGC 

TCTTTTCCCA 



ACTCATCTTC TTCAATGT 
CAAGCTGGGC CACACCAA 



' CTGAACATGA 



GGGCAGGCTC- 
GTGAGTGCCA 
AATGGCAACA 
CTCATGGGAC 



AACGGGAAGA 



GGGGGAGAGC 
AAACCGCAGC 
CCTTCTGAGC 
CGAGGGCCCT 
CCTCATTGTG 
AGGGAACAAT 
GAGACACATC 
GGGCATCACC 
GCCCGTGGTA 
GGGTACAGGT 
CTCAGCAGAA 
GAAAGTGGAA 
CTATGGGAAC 
GCTA7TTGGT 
CGTC7GC-CTG 
CTTCCCCACC 
CCGACAGCAA 
AGAAATTTGA 
TTGAAAAT7G 
ACACAGCACA 



CCCGGCTTGG ACCCACTGCA AGCTGCATCG TGCAACCTCT TTGGTGCCA3 TGTGGGCAAG 1620

GGCTCAGCCT CTCTGCCCAC AGACTGCCCC CACGTGGAAC ATTCTGGAGC TGGCCATCCC 1680

AAATTCAATC AGTCCATAGA GACGAACAGA ATGAGACCTT CCGGCCCAAG CGTGGCGCTT 1740

CCGGCCCAAG CGTGGCGCTG CGGGCACTTT GGTAGACTGT GCCACCACGG CGTGTGTTGT 1800 
GAAACGTGAA A 



Seq I 



NO: 






I 



WKCLVWSLR 



I 



I 



I 



I 



EC AMDNVTVRQC- ESATLRCTID 
L LSNTQTQYS I EIQNVDVYDE GPYTCSVQTD 
NHPKTSRVHL IVQVSPKIVE ISSDISINEG KNISLTCIAT GRPEPTVTWR HISPKAVGPV 
SEDEYLEIQG ITREQSGDYE CSASNDVAAP WRRVKVTVN YPPYISEAXG TGVPVGQKGT 
LQCEASAVPS AEFQWYKDDK RLIEGKKGVK VENRPFLSKL IFFNVSEHDY GNYTCVASNK 
LGHTNASIML FGPGAVSEVS NGTSRRAGCV W 

Seq ID NO: 638 DNA sequence 
Nucleic Acid Accession #: NM_012261.1
Coding sequence: 203..1045






GATTTGCTCT 
ACAGAATACG 
CACTCCAGCG 



ACTTCGAGTT 



GCCAGCAGCT 
CGCTCCCTCC 
GCGACTTTGA 
GGCACTGCGA 
CTCCTGATGT 



CTCCCCCTTC 



CGCTCGACAC 
TCTGTCCCCC 
CTCTGGCGGC 



CGAGTCCTAG 
GCCTCTCGCT 
CTCTGCAGCA 



GGCCAGCAAC 
TGAGGTGAAG 
CGCATATGCA 



CACCCCCGCT 
TGATCCGCAG 
TATCTCAGAT 



CGCGATTTAC 
ATCCCAGTAT 
CCAACTGGAT 



ATGCTGGGGA 



CTCAAAATGC 
AGGCTGAGCA 
GTCAGTGCTG 
GGGAAGTCCT 
AAGACGGTCA 

TTGCCCCTGA 
CACGTCCACC 
AAGCACATGG 
CAGGTAGAAC 
ATCAAACAGG 
AGGGGGAGAC 
GGAGGGGAGG 



CAGAGTTTGC 
TGATCACAGA 
GCCACAGCCA 
TCTTTGTAAA 
AAGTGCAGTT 
GGAAGCACAC 
ATGAGTGTCA 



AGCCAAATTT 
ACAGGCCGAT 
GTCGGAGCTG 
GGAAAGCCAC 
TGTCTACGAC 
AGCCAACTCG 
AGCTCAACAA 



C GTTAGGCAGG 



ATCATGGCAG 
ATATTTGTGG 
ATTGTACCTT 
ATCGCATTGA 
CAAGTGTTCT 
AACATGTCCA 
TCCTCGGAGA 
CACCACCTCT 
ACCATTTCAC 
CACATCCAAC 
GTGGATGAGC 
CTCGTCATCA 
GTGCAGATCC 



A GCATCGACAG 240 



GGGTGGATCG 
AGGGACCTGA 
AAACCCACTT 
CTGCCTTGGT 
TGGCCTCTAG 



GGGAGCAACT 
TGGTAACACT 
CTCGGGACAG 
TCCTGCTCCC 



TTGAAAACAT GCTTCTTTGA GGAGGAAACC CCTTTAGGT7 CAGAAGAATA T 
TGCTCCCTTG GACACAGCTG GCTTATCCTA TACAGTTGTC AATGCACACA G

< rcCC TGCAGCAAGA CCCCTGAAAG TGATTCATGC TTCTGGCTGG CATTCTGCAT 
GTTTAGTGAT TGTCTTGGGA ATGTTTCACT GCTACCCGCA TCCAGCGACT GCAGCACCAG 



Protein Accession #: NP_036393.1 



I 



I 



I 



MDLQGRGVPS IDRLFVLLML FHTMAQIMAE QEVENLSGLS TNPEKDIFW RENGTTCLMA 
EFAAKFIVPY DVWASNYVDL ITEQADIALT RGAEVKGRCG HSQSELQVFW VDRAYALKML 
FVKESHNMSK GPEATWRLSK VQFVYDSSEK THFKDAVSAG KHTANSHHLS ALVTPAGKSY 
ECQAQQTISL ASSDPOKTVT MILSAVHIQP FDIISDFVFS EEHKCPVDER EQLEETLPLI 
LGLILGLVIM VTLAIYHVHH KMTANQVQIP RDRSQYKHMG 



C TCCCGTCCAG 



CCTCCACCCA 



GTCTCTGCTG 
GTGGTAGCCT 



CCTGCTGACG 
GCTGCGTTGC 
GCAGGTGTTC 
CGGGAAGCAA 
TTTGGACAGT 
CCAGTCTTCA 
GGTTGGTTTT 
TACGCTTCTC 
TTTACTGTTA 
AAAGAATCAC 
CTATCATACA 
TTTATTAGTG 
AGGAAATATT 



GCTCAGGAAC CCGCGAACCC TCTCTTGACC 
CGTGTCCCGG GTCCTTCGGG CTCCTTGTGC 
CCGCCGGGGC CCCTCGCCAG CGCTGGTCCT 
ACTTGTTTAC GCGTTACGCI GAGAGTAAAC 
CCCGCAGGCC CGCAGTGCTC CAAGGTGGAA 
GTTTGTCTGG A 



GCGGAGCAGT TT7CTGGAGA 
TTTCCATTTT CTACATGGAT 
CCTGAAGTTT ACAGCTCAGC 
TTTTACCTGA TAAGTTATTC- 
TGGTTATTAG TCTTTCAATG 
TTCCTTAAAG TC7TACCG.AA 
TGCTGTTGAG GGAGGTATCC 
TTAGTTCTGT TTTCTTGGGG 



TCCCTGGACC 
TCCCTACTTT 
TAATGAAGTA 



AATATGTTAC 



TCTTTACCCT AGGATGCTAT
TTATCTGTGC AGAATATATT 
CTAATATATT CTCTTCCTAT 
CATGATTTAC TCATTAAACT 
ATTCTGGTCA CTAAATATAC 
TGATTGCTAA TTTACATAGA 
AATGATCTGT GCTCTGCAAA 
CATTTAGTCC TCAAAATATA 
TTTAAAGGTT TTGACCATTT 
AAATTGCACT TTTATTTTTT 
TGGAGAAACA 



Seq I 



Seq ID NO: 642 DNA sequence
Nucleic Acid Accession #:
Coding sequence: 27..809



TTGATTTTGT 



GTTTTGAAAA 
TACAGCATTG 
TGTTATGAGG 
CCTGTGTGTC 



CTGTATTAGA ACACTGGGTG TGTCATACCG 
GAATTTCTAA AAATTTAAGT TCTC-TAAGGG 
GTTTGATGTC TTCTTAGTAT GGCATAATGT 
ATGCTATTTT TTCACTATAG GATGACTATA 
GATGAAGAAG CCCAAAAACA GATAAATTCC 
CTTGGTTTTT TAAATAAAAG CAAAATTAAC 
TATATTTGAA CAATTTGAAT ATAAATTCAT 
CTAAGATTTT CAGATATCTA TTGTGGATCT 
AATTATACAT GTATCACATT CACTATATTA 
ATGTTGGTTT TTGGTACTTG TATTGTCATT 
AAAAAAAAAA AAAAAAA 
TCCGGAGCCA ©



CGGCCTTTTG 
GCCGGTAAAG 
TCCTCGCCGC 



3 GGCAGCATGG 



CCCAGCTTGT 
ACGACGGCCC 
CCGAGCTGTT 
TGGCAGCCCC 



GCAGGAGGCT 
CCCCCGCAAC 
GCTCGCTCGC 



GAACCCCGCG 
TTCCGGCGGT 
CTGGCGCATC 



CGGGGTCGCC GCTGCTCTGG GGGCCGCGGG 
TGCTCGGCCT GTTTCGGCCG CCCCCCGCGC 
GCCTAAGCGC AGCGTC-CCG CCCTTGGCTG 



TGCCTGCACG CCGCCTCTTG 
CAGAAGTGCC CCCGCCATCC 
TTACCCCGGC CAGCCAGCCC 
GATCTGAGC 



GATGCTGAGG 
CGCCGTGCCG 
CCACCCTGAG 



C CCGGTCTACG 540 



AGGCAGGCGA CGAGACACCC GACGTGGACC 
AAGCGCGGAC TCCGAGGGGG 
TGTGGGCTCT GAGCTGCCCC 
TGAAACGCCT AGAGACCCCG GCGCCCCAGG 
CACTGCCCGG ATCCCGTGCA CCCTGGGACC 
ACTTCTCCCC GCCAGCACGT CCAGAGCAAC 
GGATCCCTAC CCCCTGGCCC ACAATAACAT 



| | | | | |

MAGS PLLWGP RAGGVGLLVL LLLGLFRPPP ALCARPVKEP RGLSAASPPL AETGAPRRFR 

RSVPRGEAAG AVQELARALA HLLEAERQER ARAEAQEAED QQARVLAQLL RVWGAPRNSD 

D APAAQLARAL LRARLDPAAL AAQLVPAPVP AAALRPRPPV YDDGPAGPDA 

tiLRYLLG RILAGSADSE GVAAPRRLRR AADHDVGSEL PPEGVLGALL 
? QVPARRLLPP 



CTGCCGACTT 
GTTGGCCTCC 
TCCCCTCGAC 
TAGGGTGGTT 
CTAAGCTGAT 
TGTCCCGGAG 



CCTCCCCCTG 
GTCTTTGCCC 
CTGCCCACCT 



TTATGCAGCA 



I 

TTGCTGGCAT 
GCTGCTCCGC 
GTGGAAGCAA 
TACCCTCCCA 
CTTCGGGCTT 
GAAGCCCCAC 
AGCCCTTGCA 



CCCGAGCTTC 
AGACGGGGCT 
CTGCGCTGAT 
CAGATCCAGC 



I 



I 



GAGCCCTCTC 



GGCCGTAGGG 



AGGAGGTGCT 
GCCCTGAGAT 
GGGTCCGCCT GCTAGGCCTG CGGAAAACGT 



CTCCCTTGCC AGCCAGGACG 
GCAAAGCTGC AACTAATGGT 
TGATGCGCCA CAGACTTTTT 
ATCACCCAGT GAATGTACAT 
TGATTGTGTT TGGC7CTTCG 
GAAACAAAAG CTCTTTTCTT 
TCCAGTCGCC GCCGGGCCCT 



T GCCGAGCGGT G 



CCTGTGCCAG 
TTTCAGGTGG 
GCTCAGTTGA TTCAATAGAA 
TTAATACCCA GGTGACACCA 
ATTTTATGCT 



T ATGTGCGGCT 
3 CGAGGTCCCG 
Z CAAGGTGAAG 



CTAGAAAAAT 
AAACAGTTTC ACCATACATT 
ACAATTTAGA CTGCATGCCT 
TCACTGAGTT TGAGAAAGCA 
AAGGAGGTTT TGACGCCATG 
AAGAGGCTAA AAGATTGCTG 



CCTCTGAAGA 
AATAATATAG 
TCCCGTGACT 
AGCATCCACC 
CCCCATGGAT 
GTTCATAGAC 
CTTCAGGCAG 
CTGGTGATGA 



CGGCCCTGGC TTTTTTTACC GCTGCATTTG 
CCTCGTTCCT CTGGGCAGCC TGGC-TGTTT7 
ACAATAGATG TGCATCTTCA AATGCAGCAT 
AATGTGGATG GTGTGTTCAA GAGGATTTCA 
ATATTGTTTC CAATTTAATA AGCAAAGGCT 
TGCATGTTAT AATACCCACT GAAAATGAAA 
CTATCCAGCT GCGTCCAGGA GCCGAAGCTA 
AATATCCTGT GGATCTTTAT TATCTTGTTG 
AAAAATTAAA TTCCGTTGGA A 



ACAGAGAACA 
GATACACCAG 
CTGTCTGTGA AAGTCATATC GGATGGCGAA 
CAGATCAGAC GTCTCATCTC GCTCTTGATA 
GCAAATTGGC AGGCATAGTG G
ACGTCAAATC GACAACCATG G 



C ACCATTGCTG 



ACGGAAACTG 
CACTAGGCCA 
AAGGAAAACA 



A AGGATGCAGA 



AACCGCTAAA 
TAAAGGAAAG 
TAAATGTCAT 



GTCACAGGAG 
TGTGTAGATG 



TCATTTCAGA 
GCAATGATGA 
ACAGAAACTG 



TCA7CTGAAA AACAACGTCT 
ACTTTCAGAG AAATTAATAG 
ATTTCATTGG TATAAGGATC 
ATCAAAGGCT GCAAACCTCA 
AGTGAAAGTT CAGGTGGAAA 
TCCAGAAAGC 
Z ATGGAGAGTG



C CAGGAGCATC G 



CAGAATGTTT 
TCTTGATTGG 
ATAAAATTAA 
TGCAAAGTGT 



TGGATATCAG 



CAAAACCTCA 
CTCCAGCCCA 
GTTGCTTAAA 
GTCCTCATCA 
TTGCACAAGA 



ACTTATTTAG 
TACCTGTTAT 
CACTACAAGG 
TATATTCTAA 
ATGAATAAAT 
AAAGATTATT 
TTTGCAAGAT 
TTTTTACAGG 
TACTGCCATA 
GAATGTTAA 



AATGGGAAAC 
AGGAGACAAA 
AAGTATCCTC 
TTACTACTGT 
ATCAGCATAG 
CCCTACGCTT 



AGCTACTTGA 
GTCCTGATCA 
GATTACAGAG 
GCAGTCACCT 
GCTCATGAAA 



TTAGACAGGT 



ACCGACGTGA 



GGTTGCCAAA 
GATTCGTGTT 
GCTTTTTAAA 
GGATACTAAT 
ATAAGTTTAT 
AAAAACTAAT 



TTGCTCACGG 
ATCATGATGT
TTGAGACTAG 
AATGTAGATC 
CCCAGAGAGA 
CCCTGCACTG 
CACTTCAACA 
TCACTCTTTC 
GTGTGTAGTT 
TCCAGCATTC 
GTATGTCACA 
AATACAATGT 



TCATGCCAGT 
'GACTCACATA 
TGTCGTTGTA 
CTCTGAAGAG 
ACAATGCTGT 
GACATGTGAG 



CCCCACCTGT 
CAATTTGTCT 
GCA7TATGTC 
CATCATTTTC 
GATACTACAA 
AAAAAAGGAT 
GAAGCCTGAA 
CAACTTCTAA 



CAGGCTATAC 
GACCAAACTT 
ATAGTTACAT 
TGGAATAGTA 
AAGTTGATTC 



TGCTGGTTGT 
GCTGCTGACT 
GCACTTTACT 



AAGAGGTGAA 
TTATGCATGT 
TCTCCTCTTT 
GATGACTGGA 
CACTTTATCA 



CAGATACAAC CTTAATCTTA 
GTGTTTATGG TTTGCTTATT 
GCCTTTATGT TTTGTTTTCT 
TTAATTAAGT GCTAAGTTAC 
GAATACTAGT TTTAAAAGCT 



TTTTCAGAGA 
GTAATATATA 
ACTTTACAGG 
AGCATTGTGT 
ATCTGGCAAG 



2580 
2640 



3180 
3240 
3300 
3360 
3420 

3540 
3600 

3720 
3780 






MCGSALAFFT 
LGPECGWCVQ 
GEVSIQLRPG 
SRDFRLGFGS 
VHRQKISGNI 



TIAGEIESKA 



CSGRGTCVCG 



AAFVCLQNDR 
EDFISGGSRS 
AEANFMLKVH 
YVDKTVSPYI 
DTPEGGFDAM 
NNVYVKSTTM 
ANLNNLWEA 
HVTVTMKKCD 
CFQCDENKCH 
FSCPYHHGNL 



I 

RGPASFLWAA 
ERCDIVSNLI 
PLKKYPVDLY 
SIHPERIHNQ 
LQAAVCESHI 
EKPSLGQLSE 
YQKLISEVKV 
VTGGKNYAII 



CSDYNLDCMP 
GWRKEAKRLL 
KLIDNNINVI 
QVENQVQGIY 
KPIGFNETAK 
CKSHKDQPVC 



QGEDNRCASS 
YPSVIIVIIPT 
NNIEKLNSVG 
PKGYIHVLSL 



FAVQGKOFHW 



ENEINTQVTP 
NDLSRKMAFF 
TENITEFEKA 
ALDEKLAGIV 



SRKPGMEGCR 



GRFCEHCPTC 
SYLRIFFIIF 
AVTYRREKPE 



SGRGVCVCGK 
DRCQCPSAAA 
KQCLHPHHLS 
VLIIRQVILQ 



I 

ATGGAATCCG 
AGAGACATTA 
ATTTCTGCTG 
AACCCAGAGG 
GATGCTCTTT 
GATAAATATG 
GCTATTCAAG 



AGGATTTAAG 
AAAATAAGTT 
ATACTACAGA 
ACTGGTTGAG 



I 



GCCAAAATGA 



TAAAAATGAA 
TAACTCGGGA 
TTTGTTGCTC 
GATTGGTCGT 
GAGTTTTGCT 
TGCACGTGAC 
ATCTTTTGCA 



TTGACAATTG A 



r AAACCTCCAA 



CATTTACAGA 
TTATATGGAG 
CAAACTAACA 
AGCCCAGATT 
ACCTCTAGAT 



ATAGGAACAA 
AGAACATGCC 
AAACTAAACA 
GTGATGTGAA 
CAGAATGCCG 
TAAGAAATTT 
TCAGATGAAA AGAGTTCTGA 
GAATCAAGTC TTCTAGCTAA 
GAGAGTAACC AGAAACAGTG 
GCTGCATCTT 



ACTGTTAACC 
AAACTAGAGA 
TACAGTCAAG 
AGAATTCAAG 
TACTTTCAAA 
CAATTTGAAC 
GAACGTGGAG 
AAAAAGCAGC 
G CCCAAGAAT 
TCCAGAGGAC 



AAAACAGTGT 
CAATTGAAGC 
TGAGATTTGC 



CSCKKIXLGK 
QHCVMSKGQV 
QAILDQCKTS 
WHSNKIKSSS 



GAACAAAGTG 
CTTGAATAAA 
GATGGCAAAC 



TGTCACAAGG 



TTTGGAAGAG 



TGCTTTCAGA 
CA7TTTCCGG 
AGACTACTAA 
GT7ACCGGAA 



GCTTCCCCCA 
TGAATTAAAA 
AAACTGCAAG 
TAATGTCAAA 
AGAAATGCTG 
GGAGGAAAAG 



AGCCAGGTTT 
7TCATTGAGA 
CCTTCTAAAT 



ATTAGAAGAA 
GCAATCTAAG 
GCAGATTCCG 
ACCTGTCTTT 



ACATCTAAAT G 



ACTAAAGAGT ATCAAGAACC AGAGGTTCCA 
AGAAAGTCAG AG7GTA7TAA CCAGAATCCT 
GAGTTAGCCC C-AAAAG7T.Vl TACAGAGCAG 
TCAGTTTCAA AACAGTCACC ACCAATATCA 
TGTAAGACAC CAAGCAGCAA 7ACCTTGGAT 



1380 



ACTCCACTTC AAAATTTACA 
AAAGGAAGAA TTTATTCCAT 
CAGGTGTTAA ATGAAAAGAA 



AACTCCAGTT GTAAAGAATG 
ACCTGCCTGT TTCCAGCAGC 
GGTTTTAGCA TCTTCTTCAG 
TTTAAAGCAG ATAGGAAGTG 
GCTATAAAAT 



ACTTTCCACC T 



CACAGTGATA AGATCATCCG 
GTAATGGAGT GTGGAAATAT 
CCATGGGAAC GCAAGAGTTA 
CATGGCATTG TTCACAGTGA 
3 ATTTTGGGAT 



A TGGAATGCTA 



TCCAGAGAGA ATGGGAAATC 



T TACATGCCAT 
GAGAAAGATC TTCAAGATGT 
TCCATTCCTG AGCTCCTGGC 
ATGGCCAAGG GAACCACTGA 
TCTCCTAACT CCATTTTGAA 
T CTTCATCCTC 



GACTTACGGG AAAACACCAT TTCAGCAGAT 



GTTAAAGTGT T 
TCATCCCTAT GTTCAAATTC AAACTCATCC AGTTAACCAA 
AGAAATGAAA TATGTTCTGG GCCAACTTGT TGGTCTGAAT 
AGCTGCTAAA ACTTTATATG AACACTATAG TGGTGGTGAA 
CAAGACTTTT GAAAAAAAAA GGGGAAAAAA ATGA 



I 



NPEDWLSLLL 
AIQEPDDARD 
EIALRNLNLQ 
LYGENMPPQD 
TSRSECRDLV 
ESSLLAKLEE 
KHTTFEQPVF 
LSTPYGQPAC 
QVLNEKKQIY 
VMECGNIDLN 
KLIDFGIAMQ 
GCILYYMTYG 



LTIDSIMNKV 
YPQMARANCK 



TKEYQEPEVP 
SVSKQSPPIS 
FQQQQHQILA 
AIKYVKLEEA 



I 

" RDIKNKFKNE 
DALLNKLIGR 
KFAFVHISFA 
KNLSASTVLT 
QTNKTKQSCP 
SCELRNLKSV 
ESNQKQWQSK 
TSKWFDPKSI 
TPLQNLQVLA 
DNQTLDSYRN 



DLTDELSLNK 
YSQAIEALPP 
QFELSQGNVK 
AQESFSGSLG 



ISADTTDNSG 
DKYGQNESFA 
KSKQLLQKAV 



SHNSSSSKTF E 



MQPDTTSWK 
KTPFQQIINQ 
VQIQTHPVNQ 



RKSECINQNP 
CKTPSSNTLD 
SSSANECISV 
EIAYLNKLQQ 
MLEAVHTIHQ 
PPEAIKDMSS 
NHEIEFPDIP 
YVLGQLVGLN 



KGRIYSILKQ 
I

~ TVNQIMMMAN 
RIQVRFAELK 
ERGAVPLEML 
SRGQTTKARF 
SWPCFMKRQ 
TDSITLKNKT 
ELARKVNTEQ 
VKNDFPPACQ 
IGSGGSSKVF 



HGIVHSDLKP 
SRENGKSKSK 
EKDLQDVLKC 
SPNSILKAAK 



3 YEITDQYIYM 600 

ANFLIVDGML 660 

ISPKSDVWSL 720

CLKRDPKQRI 780 



CCGCAGAGGA 



CGAGTGGAGC 
GGGTCCGGCC 
ATGCCTCTGC 
GGGAACGCGG 



GCCTCGGCCA 
GGAGGACCCG 



GGCTAGCCAG 
AGCGGCTGAG 



TGTGAAGCTA 
AGATGCTTTC 
AAACCCCGGC 
CTCAGTGGCC 
ATAAACTGTC 
TCAGGACTCC 
GGTAAAGTCA 
AAATGTCACA 
AATGAATGTA 
GGGTCCTTCA 



CATCCGAACC 



GCATCACGGG 
GGCCTGCTGC 
TGGATGTAAG 



GGCGCCCCCA GCCCCTCCCC AGGCCGCGAG 60 

CCAGACTGCA GGGACAGCAC CCGGTAACTG 120

GAGAGAGGAG GCGGCGGCTT AGCTGCTACG 180

TCAGGAGGAG GAAGGAGGAC CCGTGCGAGA 240

CTGCTGCTCT CCTGGGTGGC AGGTGGTTTC 300 

TTGTTAGCAT CGGCACGTCA GCCTGGGGTC 360 

TACGGCTGGA GAAGAAACAG CAAGGGAGTC 420 

TTTGGTGAGT GCGTGGGACC AAACAAATGC 480 



CATGCCAACA 
ACATGCTCAT 
AGTACAGCTG 



TTGGTTTCGA 
CTATGGATAG 
AGTGTAAATG 



AAGAAGTTGC 
CCAGAACCCA 
ATAGTTTCCA 
GAGGGGCTTG 



TTGCTCACAA 



CTGGTCCAAA 
GACTGCAGCT 
TGGAATCCTG 
GGTCACAAGA 



GAGGCGGGAA 
AGGATGAGAA 
GAGATGTGTT 
GGAAAGCGCT 
TCAATCATGG 
CTGATCGAGA 



AAATGGAAGA 
CAATCGAAGA 
ACTGCAATAT 
CCATACGTGC 
CAAGCAGGGA 
GGAAGTCCTC 
AAACAGCATG 
TACCCCTAAG 
CTCTCATGGA 
AAGAGAAGAG 
TTTCCCTAAG 



TGCAGTCAAG 
AATACACACG 
ACGTGTGTGA 
GAAGAAGGGC 
GACTGTCTAG 
TGTGTGAACA 
ATCAGTGGAC 
AGCCACCATG 
TATAAAGGCA 
AGAGCACCTG 
AAAAAGAAGG 



ATGTGAATGA 
GAAGCTACAA 
ACTCTAGGAC 



GTGAATGAAG 



CCAATTGCTT 
ATGGACT7CG 
GTACCATCAA 
CAAAAATTAA 
AGCCCTTCAA 
GGAATGAAGA 
AGAATGACAT 
CAGGTGAATT 



GTGCTTTTGC 
ATGTGCCATG 
GTGTCCATCC 
ATGTGCCTCT 
CTACTACTGC 
TATAGATATA 
CAATACCCAA 
GTGTTCTGCT 



AAATGTTACC 
CTATGAAGAG 
GAAAATGAAA 
AGAGGAGGGA 
CGGCCTGATT 



GATCTGTGAC 
TTACCGGCTG 



TCAGGCTTAT GTCCAGATAG CCTTTTATCT 



TTAGAATTAC 
TCTTGTATAA 
TTTCTGAATC 
CAGTATATCT 
TAGAAAAAAA 
TATGACATCA 



TAGCTGAAAA 
GATATGCCAA 
TTTCCACATT 



ATTGTAATGT 
TATTTGCTTT 
ATATTATAAA 
AGTAAGTTGA 
AATGTTTAAC 
TTTGCCTAAG 



GCCGGAGACA 
TGGGAGAAGA CCACGAGTGA GGATGAAAAG 
GGAACTGATG 
GAAATCGCAG
GTGGATGACT 
TTGATATTGC 



AAATATCATA 
ATATGGAAAT 
TGAGCTTCTC 
TGTTTGACTC 
TGGCTTAGCT 



GTCAGTTTAT C 
TCTACAACAT TTCTAGAAAA 
TTA7GATACT TCTTGGAAAC 
GGGTCTTTCA TAGCCAAACT 
MPLPWSLALP LLLSWVAGGF GNAASARHHG LLASARQPGV CHYGTKLACC YGWRRNSKGV
C'EATCEPGCK FGECVGPNKC RCFPGYTGKT CSQDVNSCGM KPRPCQ'HRCV HTy SY ZF 
LSGHMLMPDA TCVNSRTCAM INCQYSCEDT EEGPQCLCPS SGLRLAPNGR DCLDIDECAS 
GKVICPYNRR CVWTFGSYYC KCHIGFELQY ISGRYDCIDI NECTMDSHTC SHHANCFNTQ 
GSFKCKCKQG YKGNGLRCSA IPENSVKEVL RAPGTIKDRI KKLLAHKNSK KKKAKIKNVT 
PEPTRTPTPK VNLQPFNYEE IVSRGGNSHG GKKGNEEKMK EGLEDEKSEE KALKNDIEER 
SLRGDVFFPK VHEAGEFGLI LVQRKALT3K LEHKDLNISV DCSFNHGICD WKQDREDDFD 
WNPADRDNAI GFYMAVPALA GHKKDIGRLK LLLPDLQPQS NFCLLFDYRL AGDKVGKLRV 
FVKHSNNALA WEKTTSEDEK WKTGKIQLYQ GTDATKSIIF EAERGKGKTG E 
SGLCPDSLLS VDD 

Seq ID NO: 650 DNA sequence 



| | | | | |

GCAGCTCCAG TCCCGGACGC AACCCCGGAG CCGTCTCAGG TCCCTGGGGG GAACGGTGGG 
TTAGACGGGG ACGGGAAGGG ACAGCGGCCT TCGACCGCCC CCCGAGTAAT TGACCCAGGA 
CTCATTTTCA GGAAAGCCTG AAAATGAGTA AAATAGTGAA ATGAGGAATT TGAACATTTT 
ATCTTTGGAT GGGGATCTTC TGAGGATGCA AAGAGTGATT CATCCAAGCC ATGTGGTAAA 
ATCAGGAATT TGAAGAAAAT GGAGATGTTT ACATTTTTGT TGACGTGTAT TTTTCTACCC 
CTCCTAAGAG GGCACAGTCT CTTCACCTGT GAACCAATTA CTGTTCCCAG ATGTATGAAA 
ATGGCCTACA ACATGACGTT TTTCCCTAAT CTGATGGGTC ATTATGACCA GAGTATTGCC 
GCGGTGGAAA TGGAGCATTT TCTTCCTCTC GCAAATCTGG AATGTTCACC AAACATTGAA 
ACTTTCCTCT GCAAAGCATT TGTACCAACC TGCATAGAAC AAATTCATGT GGTTCCACCT 
TGTCGTAAAC TTTGTGAGAA AGTATATTCT GATTGCAAAA AATTAATTGA CACTTTTGGG 
ATCCGATGGC CTGAGGAGCT TGAATGTGAC AGATTACAAT ACTGTGATGA GACTGTTCCT 
GTAACTTTTG ATCCACACAC AGAATTTCTT GGTCCTCAGA AGAAAACAGA ACAAGTCCAA 
AGAGACATTG GATTTTGGTG TCCAAGGCAT CTTAAGACTT CTGGGGGACA AGGATATAAG 
TTTCTGGGAA TTGACCAGTG TGCGCCTCCA TGCCCCAACA TGTATTTTAA AAGTGATGAG 
CTAGAGTTTG CAAAAAGTTT TATTGGAACA GTTTCAATAT TTTGTCTTTG TGCAACTCTG
[■GATGTT AGAAGATTCA GATACCCAGA GAGACCAATT 
T CTGTCTGTTA CAGCATTGTA TCTCTTATGT ACTTCATTGG ATTTTTGCTG 
A T AAGG CAGAT GAGAAGCTAG AACTTGGTGA CACTGTTGTC 
:accgtt TTGTTCATGC TTTTGTATTT TTTCACAA7G 
3 TGTGGTGGGT GATTCTTACC ATTACTTGGT TCTTAGCTGC AGGAAGAAAA 
TGGAGTTGTG AAGCCATCGA GCAAAAAGCA GTGTGGTTTC ATGCTGTTGC ATGGGGAACA 
CCAGGTTTCC TGACTGTTAT GCTTCTTGCT CTGAACAAAG TTGAAGGAGA CAACATTAGT 
GGAGTTTGCT TTGTTGGCCT TTATGACCTG GATGCTTCTC GCTACTTTGT ACTCTTGCCA 
CTGTGCCTTT GTGTGTTTGT TGGGCTCTCT CTTCTTTTAG CTGGCATTAT TTCCTTAAAT
CATGTTCGAC AAGTCATACA ACATGATGGC CGGAACCAAG AAAAACTAAA GAAATTTATG
ATTCGAATTG GAGTCTTCAG CGGCTTGTAT CTTGTGCCAT TAGTGACACT TCTCGGATGT 
TACGTCTATG AGCAAGTGAA CAGGATTACC TGGGAGATAA CTTGGGTCTC TGATCATTGT 
CGTCAGTACC ATATCCCATG TCCTTATCAG GCAAAAGCAA AAGCTCGACC AGAATTGGCT 
TTATTTATGA TAAAATACCT GATGACATTA ATTGTTGGCA TCTCTGCTGT CTTCTGGGTT 1740 
GGAAGCAAAA AGACATGCAC AGAATGGGCT GGGTTTTTTA AACGAAATCG CAAGAGAGAT 
CCAATCAGTG AAAGTCGAAG AGTACTACAG GAATCATGTG AGTTTTTCTT AAAGCACAAT 
TCTAAAGTTA AACACAAAAA GAAGCACTAT AAACCAAGTT CACACAAGCT GAAGGTCATT 
TCCAAATCCA TGGGAACCAG CACAGGAGCT ACAGCAAATC ATGGCACTTC TGCAGTAGCA 
ATTACTAGCC ATGATTACCT AGGACAAGAA ACTTTGACAG AAATCCAAAC CTCACCAGAA 2040
ACATCAATGA GAGAGGTGAA AGCGGACGGA GCTAGCACCC CCAGGTTAAG AGAACAGGAC 2100
TGTGGTGAAC CTGCCTCGCC AGCAGCATCC ATCTCCAGAC TCTCTGGGGA ACAGGTCGAC 2160
GGGAAGGGCC AGGCAGGCAG TGTATCTGAA AGTGCGCGGA GTGAAGGAAG GATTAGTCCA 2220
AAGAGTGATA TTACTGACAC TGGCCTGGCA CAGAGCAACA ATTTGCAGGT CCCCAGTTCT 2280
TCAGAACCAA GCAGCCTCAA AGGTTCCACA TCTCTGCTTG TTCACCCAGT TTCAGGAGTG 2340
AGAAAAGAGC AGGGAGGTGG TTGTCATTCA GATACTTGAA GAACATTTTC TCTCGTTACT 2400
CAGAAGCAAA TTTGTGTTAC ACTGGAAGTG ACCTATGCAC TGTTTTGTAA GAATCACTGT 2460
TACGTTCTTC TTTTGCACTT AAAGTTGCAT TGCCTACTGT TATACTGGAA AAAATAGAGT 2520 
TCAAGAATAA TATGACTCAT TTCACACAAA GGTTAATGAC AACAATATAC CTGAAAACAG 2580 
AAATGTGCAG GTTAATAATA TTTTTTTAAT AGTGTGGGAG GACAGAGTTA GAGGAATCTT 2640 
CCTTTTCTAT TTATGAAGAT TCTACTCTTG GTAAGAGTAT TTTAAGATGT ACTATGCTAT 2700
TTTACCTTTT TGATATAAAA TCAAGATATT TCTTTGCTGA AGTATTTAAA TCTTATCCTT 2760
GTATCTTTTT ATACATATTT GAAAATAAGC TTATATGTAT TTGAACTTTT TTGAAATCCT 2820
ATTCAAGTAT TTTTATCATG CTATTGTGAT ATTTTAGCAC TTTGGTAGCT TTTACACTGA 2880
ATTTCTAAGA AAATTGTAAA ATAGTCTTCT TTTATACTGT AAAAAAAGAT ATACCAAAAA 2940
GTCTTATAAT AGGAATTTAA CTTTAAAAAC CCACTTATTG ATACCTTACC ATCTAAAATG 3000
TGTGATTTTT ATAGTCTCGT TTTAGGAATT TCACAGATCT AAATTATGTA ACTGAAATAA 3060
GGTGCTTACT CAAAGAGTGT CCACTATTGA TTGTATTATG CTGCTCACTG ATCCTTCTGC 3120 
ATATTTAAAA TAAAATGTCC TAAAGGGTTA GTAGACAAAA TGTTAGTCTT TTGTATATTA 3180
GGCCAAGTGC AATTGACTTC CCTTTTTTAA TGTTTCATGA CCACCCATTG ATTGTATTAT 3240 
AACCACTTAC AGTTGCTTAT ATTTTTTGTT TTAACTTTTG TTTCTTAACA TTTAGAATAT 3300 
TACATTTTGT ATTATACAGT ACCTTTCTCA GACATTTTGT AG 
MEMFTFLLTC IFLPLLRGHS
FLPLANLECS PNIETFLCKA 



LFTCEPITVP 
FVPTCIEQIH 
TEFLGPQKKT 
FIGTVSIFCL 



FFPNLMGHYD 
KVYSDCKKLI 
CPRHLKTSGG 



QSIAAVEMEH 
QGYKFLGIDQ 
DTWLGSQNK ACTVLFMLLY



I EQKAVWFHAV 

[ ISLNHVRQVI 
SGLYLVPLVT LLGCYVYEQV NRITWEITWV SDHCRQYHIP 
U RKRDPISESR 



! LKVISKSMGT 
KADGASTPRL REQDCGEPAS 
TGLAQSNNLQ VPSSSEPSSL 



STGATANHGT 
PAASISRLSG 
KGSTSLLVHP 



MLLALNKVEG DNISGVCFVG 360 

QHDGRHQEKL KKFMIRIGVF 420 

CPYQAKAKAR PELALFMIKY 480 

RVLQESCEFF LKHNSKVKHK 540

LGQETLTEIQ TSPETSMREV 600 

RISPKSDITD 660 



I 

TTGGCGGGCG 
GCCGCGTCTC 
TCCGCCCCTC 
ATGATGAACT 
AGGTCAAACT 
AAAACACACT 
TGAGACATCA 
TGGTTCTTGA 
TGTCAGAAGA 
ACAGCCAGGG 



Seq ID NO: 652 DNA sequence
Nucleic Acid Accession #: NM_014791.1
Coding sequence: 171..212S



CAACCCGGCG 
AGGCCCCTGT 
TCTAATTCCA 
TATGAATTAC 
ATCCTTACTG 
TTGCCCCGGA 
CAACTCTACC 



I 

" GAAGCGGCCA 
TCAGGACAGC 
AGGTTCTTTT 
TCTCAAATAT 
TGCCTGCCAT 
AGGGAGTGAT 
GCATATATGT 
GTACTGCCCT 



CCTTCTGTCG 

ATGAAACTAT 
GAGAGATGGT 
TCAAAACGGA 
ATGTGCTAGA 
TGTTTGACTA 



TTCTTAGGAA 
CAAGAGGACT 



k AGCCAGAAAA 



ATCTACAGAC 
CATATCTTGG 
GTGGATTTCT 
GAAAATATGA 



TGCAAGATTA 
ATGATTGCGT 
TAATTTCACT 
AGGCTCGGGG AAAACCAGTT 
CTACCCCATT CACAGACATC 
ATAAAAATTA TGTGGCGGGA 
GTGCTGCTAC TCCCCGAACA 
AATCTAAATC ATTAACTCCA 
AAAATGTATA TACTCCTAAG 
CAAAGACTCC 
ACACTACACC 



CTATGCTCAC 
GCTGATTGAC 
ATGCTGTGGG 
ATCAGAGGCA 
ACCATTTGAT 
TGTTCCCAAG TGGCTCTCTC CCAGTAGCAT 
CCCAAAGAAA CGGATTTCTA 
CAACTATCCT 
AACAGAACTT 



3 AGTCTGGCTT 
A GATGTTTGGA 
T GATGATAATG 



ATGCAGCACC 
GCATGGGCAT 
TAATGGCTTT 



AGCTATAAAA 
GATTGAGGCC 
GACAGCCAAC 
TATAATTTCC 
ATCTGCTGTT 
TTTGCTGTTT 
CAAGGGTAAC 
TGAGTTAATA 
ACTGTTATAT 
I

' CGCCGTACCA 
AGCCGTGCCC 
ATGAAAGATT 
GGCTTTGCAA 
ATCATGGATA 
TTGAAGAACC 



AAGGATTACC 
CAAGGCAAAT 
GTTCTTATGT 



TCTGCTTCTT CAACAAATGC 
ATTGAACCAT CCCTGGATCA 
TCCTTTTATT CACCTCGATG 
CAGGCAAACA ATGGAGGATT 
TCTTCTGCTT CTAGCCAAGA 



TGAAAAATCT 
AAAGCAAGAA 
ACAGAAACAA 
CGGCTACCTA 



TTAATAGACT 
TCACAGTTTA 
GCCTTATGCA 
TCTGCTGTAA 



3 ACCG CAAGTG 



CTCAAAAGCT 



AAGTTAATGA CAGGTGTCAT 



GCAAAAGGAA 



TGGATAAGGT 



GACAGAATCA AATGGGGTGG 
AAATAAATTA AAGAACAAAG 
GTACTTTATG TTTCCTGAGC 
ACTCACTACG CCAAATCGTT 
AACTCCAATT AAAATACCAG 
AGGCGGTGCC 
AGAAAGGGAG 



CCAGATCAAC 



TTGGGAAAGT GACAATGCAA T 
TGGGTAT CAG 
ACATCCTATC 
GGTGTGATAC 
CTACCAACTT 



TAATCATGTG 
AATGTAAGCT 
TTGTGAATAT 



AGCCTACATA 
GTTTCTAAAG 
TGTGTATGAA 
GTTTTGTATA 
CTTAACTATG 



AAGACTGTTA 
AGCTATCTTA 
TCTAAATCAA 
TTAATAATTG 
TCTCTTTGTA 



TGTTGAATGA 
ATACACTGAA 
AAGTGTGCCA 
ATGCCTGGGT 
TGGATTCTTC 



AGACCAATAT 



AAAGC7TCAC 
AATAATGTCT 
GTGTCAAACA 
GCTICAAAAA 
TTACAAAAGA 
CATCCTGCCG 
GATTTTAAAG 
CTCTTTGTTT 
CATTATGTTA 
TAGATTCACT 
TCTTTCTGAA 



TATAATG7GA 
ATTCTTCCAA 
CAGTCAGATT 
CCCGATGTGG 
TTAGTGGAAG 
GATGAGTGTG 
TTCATTGGAA 
TTAAACAAAA 
CTGTCTTTTT 
TCCATATGTG 
ATAAAACCAT 



2280 
2340 
2400 
2460



I 



I 



I 

MKDYDELLKY 
LKNLRHQHIC QLYHVLETAN KIFMVLEYCP 
AYVHSQGYAH RDLKPENLLF DEYHKLKLID 
QGKSYLGSEA DVWSMGILLY VLMCGFLPFD 
QQMLQVDPKK RISMKNLLNH PWIMQDYNYP 
MEDLISLWQY DHLTATYLLL LAKKARGKPV 



ILTGEMVAIK 
GGELFDYIIS 
FGLCAKPKGN 



KNKENVYTPK SAVKNEEYFM 
KIPVNSTGTD KLMTGVISPE RRCRSVELDL 
LTRSKRKGSA RDGPRRLKLH YNVTTTRLVK 
QSDFGKVTMQ FELEVCQLQK PDWGIRRQR 



VEWQSKNPFI 
RLRLSSFSCG 
SQFTKYWTES 
NQHKREILTT 
NQAHMEETPK 
PDQLLNEIMS 



41 
I 

IMDKNTLGSD 
QDRLSEEETR 
KDYHLQTCCG 
IMRGKYDVPK 
HLDDDCVTEL 
QASATPFTDI 



WFRQIVSAV 
SLAYAAPELI 
WLSPSSILLL 



RKGAKVFGSL 
ILPKKHVDFV 
LVSDILSSCK 



ERGLDKVITV 540 
QKGYTLKCQT 600 
GGCATCACCT GTGCCATACC AGTTAAACAG GCTGATTCTG GAAGTTCTGA GGAAAAGCAG 180

CTTTACAACA AATACCCAGA TGCTGTGGCC ACATGGCTAA ACCCTGACCC ATCTCAGAAG 240 

CAGAATCTCC TAGCCCCACA GACCCTTCCA AGTAAGTCCA ACGAAAGCCA TGACCACATG 300 

GATGATATGG ATGATGAAGA TGATGATGAC CATGTGGACA GCCAGGACTC CATTGACTCG 360

AACGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC AGTCTGATGA GTCTCACCAT 420

TCTGATGAAT CTGATGAACT GGTCACTGAT TTTCCCACGG ACCTGCCAGC AACCGAAGTT 480

TTCACTCCAG TTGTCCCCAC AGTAGACACA TATGATGGCC GAGGTGATAG TGTGGTTTAT 540 

GGACTGAGGT CAAAATCTAA GAAGTTTCGC AGACCTGACA TCCAGTACCC TGATGCTACA 600 

GACGAGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA ATGGTGCATA CAAGGCCATC 660 

CCCGTTGCCC AGGACCTGAA CGCGCCTTCT GATTGGGACA GCCGTGGGAA GGACAGTTAT 720

GAAACGAGTC AGCTGGATGA CCAGAGTGCT GAAACCCACA GCCACAAGCA GTCCAGATTA 780 

TATAAGCGGA AAGCCAATGA TGAGAGCAAT GAGCATTCCG ATGTGATTGA TAGTCAGGAA 840 

CTTTCCAAAG TCAGCCGTGA ATTCCACAGC CATGAATTTC ACAGCCATGA AGATATGCTG 900 

GTTGTAGACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA AATTTCGTAT TTCTCATGAA 960 

TTAGATAGTG CATCTTCTGA GGTCAATTAA AAGGAGAAAA AATACAATTT CTCACTTTGC 1020

ATTTAGTCAA AAGAAAAAAT GCTTTATAGC AAAATGAAAG AGAACATGAA ATGCTTCTTT 1080 

CTCAGTTTAT TGGTTGAATG TGTATCTATT TGAGTCTGGA AATAACTAAT GTGTTTGATA 1140 

ATTAGTTTAG TTTGTGGCTT CATGGAAACT CCCTGTAAAC TAAAAGCTTC AGGGTTATGT 1200 

CTATGTTCAT TCTATAGAAG AAATGCAAAC TATCACTGTA TTTTAATATT TGTTATTCTC 1260 

TCATGAATAG AAATTTATGT AGAAGCAAAC AAAATACTTT TACCCACTTA AAAAGAGAAT 1320

ATAACATTTT ATGTCACTAT AATCTTTTGT TTTTTAAGTT AGTGTATATT TTGTTGTGAT 1380 

TATCTTTTTG TGGTGTGAAT AAATCTTTTA TCTTGAATGT AATAAGAATT TGGTGGTGTC 1440 

AATTGCTTAT TTGTTTTCCC ACGGTTGTCC AGCAATTAAT AAAACATAAC CTTTTTTACT 1500 
GCCTAAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 






1 11 21 31 41 51 

I I I I I I

MRIAVICFCL LGITCAIPVK QADSGSSEEK QLYNKYPDAV ATWLNPDPSQ KQNLLAPQTL
PSKSNESHDH MDDMDDEDDD DHVDSQDSID SNDSDDVDDT DDSHQSDESH HSDESDELVT
DFPTDLPATE VFTPVVPTVD TYDGRGDSVV YGLRSKSKKF RRPDIQYPDA TDEDITSHME
SEELNGAYKA IPVAQDLNAP SDWDSRGKDS YETSQLDDQS AETHSHKQSR LYKRKANDES
NEHSDVIDSQ ELSKVSREFH SHEFHSHEDM LVVDPKSKEE DKHLKFRISH ELDSASSEVN



Seq ID NO: 656 DNA sequence 

Nucleic Acid Accession #: NM_003108.1 

Coding sequence: 76..1401



I I i I 

GGGGTGGGAG GGGGAGGGGG ACCTCCGCAC G" " 



GCCCTGCAAC GGATCATGGT GCAGCAGGCG GAGAGCTTGG AAGCGGAGAG CAACCTGCCC 120

CGGGAGGCGC TGGACACGGA GGAGGGCGAA TTCATGGCTT GCAGCCCGGT GGCCCTGGAC 180 

GAGAGCGACC CAGACTGGTG CAAGACGGCG TCGGGCCACA TCAAGCGGCC GATGAACGCG 240 

TTCATGGTAT GGTCCAAGAT CGAACGCAGG AAGATCATGG AGCAGTCTCC GGACATGCAC 300

AACGCCGAGA TCTCCAAGAG GCTGGGCAAG CGCTGGAAAA TGCTGAAGGA CAGCGAGAAG 360

ATCCCGTTCA TCCGGGAGGC GGAGCGGCTG CGGCTCAAGC ACATGGCCGA CTACCCCGAC 420

TACAAGTACC GGCCCCGGAA AAAGCCCAAA ATGGACCCCT CGGCCAAGCC CAGCGCCAGC 480 

CAGAGCCCAG AGAAGAGCGC GGCCGGCGGC GGCGGCGGGA GCGCGGGCGG AGGCGCGGGC 540 

GGTGCCAAGA CCTCCAAGGG CTCCAGCAAG AAATGCGGCA AGCTCAAGGC CCCCGCGGCC 600 

GCGGGCGCCA AGGCGGGCGC GGGCAAGGCG GCCCAGTCCG GGGACTACGG GGGCGCGGGC 660 

GACGACTACG TGCTGGGCAG CCTGCGCGTG AGCGGCTCGG GCGGCGGCGG CGCGGGCAAG 720

ACGGTCAAGT GCGTGTTTCT GGATGAGGAC GACGACGACG ACGACGACGA CGACGAGCTG 780 

CAGCTGCAGA TCAAACAGGA GCCGGACGAG GAGGACGAGG AACCACCGCA CCAGCAGCTC 840 

CTGCAGCCGC CGGGGCAGCA GCCGTCGCAG CTGCTGAGAC GCTACAACGT CGCCAAAGTG 900 

CCCGCCAGCC CTACGCTGAG CAGCTCGGCG GAGTCCCCCG AGGGAGCGAG CCTCTACGAC 960 

GAGGTGCGGG CCGGCGCGAC CTCGGGCGCC GGGGGCGGCA GCCGCCTCTA CTACAGCTTC 1020 

AAGAACATCA CCAAGCAGCA CCCGCCGCCG CTCGCGCAGC CCGCGCTGTC GCCCGCGTCC 1080 

TCGCGCTCGG TGTCCACCTC CTCGTCCAGC AGCAGCGGCA GCAGCAGCGG CAGCAGCGGC 1140 

GAGGACGCCG ACGACCTGAT GTTCGACCTG AGCTTGAATT TCTCTCAAAG CGCGCACAGC 1200

GCCAGCGAGC AGCAGCTGGG GGGCGGCGCG GCGGCCGGGA ACCTGTCCCT GTCGCTGGTG 1260 

GATAAGGATT TGGATTCGTT CAGCGAGGGC AGCCTGGGCT CCCACTTCGA GTTCCCCGAC 1320

TACTGCACGC CGGAGCTGAG CGAGATGATC GCGGGGGACT GGCTGGAGGC GAACTTCTCC 1380 

GACCTGGTGT TCACATATTG AAAGGCGCCC GCTGCTCGCT CTTTCTCTCG GAGGGTGCAG 1440 

AGCTGGGTTC CTTGGGAGGA AGTTGTAGTG GTGATGATGA TGATGATGAT AATGATGATG 1500

ATGATGGTGG TGTTGATGGT GGCGGTGGTA GGGTGGAGGG GAGAGAAGAA GATGCTGATG 1560 

ATATTGATAA GATGTCGTGA CGCAAAGAAA TTGGAAAACA TGATGAAAAT TTTGGTGGAG 1620

TTAAAGTGAA ATGAGTAGTT TTTAAACATT TTTCCTGTCC TTTTTTTGTC CCCCCTCCCT 1680 

TCCTTTATCG TGTCTCAAGG TAGTTGCATA CCTAGTCTGG AGTTGTGATT ATTTTCCCAA 1740

AAAATGTGTT TTTGTAATTA CTATTTCTTT TTCCTGAAAT TCGTGATTGC AACAAAGGCA 1800 

GAGGGGGCGG CGCGGCGGAG GGGAGGTAGG ACCCGCTCCG GAAGGCGCTG TTTGAAGCTT 1860 

GTCGGTCTTT GAAGTCTGGA AGACGTCTGC AGAGGACCCT TTTGGCAGCA CAACTGTTAC 1920

TCTAGGGAGT TGGTGGAGAT ATTTTTTTTT CTTAAGAGAA CTTAAAGAAC K3GTGATTTT 1980 
TTTTTAACAA AAAAAGGG 

Seq ID NO: 657 protein sequence 



MVQQAESLEA ESNLPREALD TEEGEFMACS PVALDESDPD WCKTASGHIK RPMNAFMVWS
KIERRKIMEQ SPDMHNAEIS KRLGKRWKML KDSEKIPFIR EAERLRLKHM ADYPDYKYRP
RKKPKMDPSA KPSASQSPEK SAAGGGGGSA GGGAGGAKTS KGSSKKCGKL KAPAAAGAKA 
GAGKAAQSGD YGGAGDDYVL GSLRVSGSGG GGAGKTVKCV FLDEDDDDDD DDDELQLQIK 
QEPDEEDEEP PHQQLLQPPG QQPSQLLRRY NVAKVPASPT LSSSAESPEG ASLYDEVRAG 300

ATSGAGGGSR LYYSFKNITK QHPPPLAQPA LSPASSRSVS TSSSSSSGSS SGSSGEDADD 360 

LMFDLSLNFS QSAHSASEQQ LGGGAAAGNL SLSLVDKDLD SFSEGSLGSH FEFPDYCTPE 420 
LSEMIAGDWL EANFSDLVFT Y

Seq ID NO: 658 DNA sequence 
Nucleic Acid Accession #: NM_001719
Coding sequence: 123..1418

1 11 21 31 41 51

I I I I I I 

GGGCGCAGCG GGGCCCGTCT GCAGCAAGTG ACCGACGGCC GGGACGGCCG CCTGCCCCCT 60 

CTGCCACCTG GGGCGGTGCG GGCCCGGAGC CCGGAGCCCG GGTAGCGCGT AGAGCCGGCG 120 

CGATGCACGT GCGCTCACTG CGAGCTGCGG CGCCGCACAG CTTCGTGGCG CTCTGGGCAC 180 

CCCTGTTCCT GCTGCGCTCC GCCCTGGCCG ACTTCAGCCT GGACAACGAG GTGCACTCGA 240

GCTTCATCCA CCGGCGCCTC CGCAGCCAGG AGCGGCGGGA GATGCAGCGC GAGATCCTCT 300 

CCATTTTGGG CTTGCCCCAC CGCCCGCGCC CGCACCTCCA GGGCAAGCAC AACTCGGCAC 360 

CCATGTTCAT GCTGGACCTG TACAACGCCA TGGCGGTGGA GGAGGGCGGC GGGCCCGGCG 420 

GCCAGGGCTT CTCCTACCCC TACAAGGCCG TCTTCAGTAC CCAGGGCCCC CCTCTGGCCA 480 

GCCTGCAAGA TAGCCATTTC CTCACCGACG CCGACATGGT CATGAGCTTC GTCAACCTCG 540

TGGAACATGA CAAGGAATTC TTCCACCCAC GCTACCACCA TCGAGAGTTC CGGTTTGATC 600 

TTTCCAAGAT CCCAGAAGGG GAAGCTGTCA CGGCAGCCGA ATTCCGGATC TACAAGGACT 660 

ACATCCGGGA ACGCTTCGAC AATGAGACGT TCCGGATCAG CGTTTATCAG GTGCTCCAGG 720

AGCACTTGGG CAGGGAATCG GATCTCTTCC TGCTCGACAG CCGTACCCTC TGGGCCTCGG 780 

AGGAGGGCTG GCTGGTGTTT GACATCACAG CCACCAGCAA CCACTGGGTG GTCAATCCGC 840

GGCACAACCT GGGCCTGCAG CTCTCGGTGG AGACGCTGGA TGGGCAGAGC ATCAACCCCA 900 

AGTTGGCGGG CCTGATTGGG CGGCACGGGC CCCAGAACAA GCAGCCCTTC ATGGTGGCTT 960 

TCTTCAAGGC CACGGAGGTC CACTTCCGCA GCATCCGGTC CACGGGGAGC AAACAGCGCA 1020 

GCCAGAACCG CTCCAAGACG CCCAAGAACC AGGAAGCCCT GCGGATGGCC AACGTGGCAG 1080 

AGAACAGCAG CAGCGACCAG AGGCAGGCCT GTAAGAAGCA CGAGCTGTAT GTCAGCTTCC 1140

GAGACCTGGG CTGGCAGGAC TGGATCATCG CGCCTGAAGG CTACGCCGCC TACTACTGTG 1200 

AGGGGGAGTG TGCCTTCCCT CTGAACTCCT ACATGAACGC CACCAACCAC GCCATCGTGC 1260

AGACGCTGGT CCACTTCATC AACCCGGAAA CGGTGCCCAA GCCCTGCTGT GCGCCCACGC 1320

AGCTCAATGC CATCTCCGTC CTCTACTTCG ATGACAGCTC CAACGTCATC CTGAAGAAAT 1380 

ACAGAAACAT GGTGGTCCGG GCCTGTGGCT GCCACTAGCT CCTCCGAGAA TTCAGACCCT 1440

TTGGGGCCAA GTTTTTCTGG ATCCTCCATT GCTCGCCTTG GCCAGGAACC AGCAGACCAA 1500 

CTGCCTTTTG TGAGACCTTC CCCTCCCTAT CCCCAACTTT AAAGGTGTGA GAGTATTAGG 1560 

AAACATGAGC AGCATATGGC TTTTGATCAG TTTTTCAGTG GCAGCATCCA ATGAACAAGA 1620 

TCCTACAAGC TGTGCAGGCA AAACCTAGCA GGAAAAAAAA ACAACGCATA AAGAAAAATG 1680

GCCGGGCCAG GTCATTGGCT GGGAAGTCTC AGCCATGCAC GGACTCGTTT CCAGAGGTAA 1740

TTATGAGCGC CTACCAGCCA GGCCACCCAG CCGTGGGAGG AAGGGGGCGT GGCAAGGGGT 1800 

GGGCACATTG GTGTCTGTGC GAAAGGAAAA TTGACCCGGA AGTTCCTGTA ATAAATGTCA 1860 

CAATAAAACG AATGAATG 

Seq ID NO: 659 Protein sequence
Protein Accession #: NP_001710

1 11 21 31 41 51 

I I I I I I

MHVRSLRAAA PHSFVALWAP LFLLRSALAD FSLDNEVHSS FIHRRLRSQE RREMQREILS 60

ILGLPHRPRP HLQGKHNSAP MFMLDLYNAM AVEEGGGPGG QGFSYPYKAV FSTQGPPLAS 120

LQDSHFLTDA DMVMSFVNLV EHDKEFFHPR YHHREFRFDL SKIPEGEAVT AAEFRIYKDY 180

IRERFDNETF RISVYQVLQE HLGRESDLFL LDSRTLWASE EGWLVFDITA TSNHWVVNPR 240

HNLGLQLSVE TLDGQSINPK LAGLIGRHGP QNKQPFMVAF FKATEVHFRS IRSTGSKQRS 300

QNRSKTPKNQ EALRMANVAE NSSSDQRQAC KKHELYVSFR DLGWQDWIIA PEGYAAYYCE 360

GECAFPLNSY MNATNHAIVQ TLVHFINPET VPKPCCAPTQ LNAISVLYFD DSSNVILKKY 420

RNMVVRACGC H

Seq ID NO: 660 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 211..1895

1 11 21 31 41 51

I I I I I I

GGATCTGAGG GGCGCCCAGT CACTTCCTCC ACGTTCTCGT GCTGGGCGGG AGGAGCGGAT 60

GGGGCTTGGG AGGCAGCCTG CTCTCCAGTC CCTATCCACC CACAGGTTTT TTGGGTCGGA 120 

GAGGAATTAT CTGATAAAAT TCCTGGGTTA ATATTTTTAA AAACGGAGAG TTTTTAAAAA 180 

TGATTTTTTT CCCTCGAAAA TGACCTTTTT ATGCTTCGAA GCAGTTTGTC AACCAGCATA 240 

GTGCTTTTTC TTTTCTCTTC TTTTTCTACG ATAAATGAAA GCATTTCTTC AAGAAAAAGG 300 

CACAGGTTCC TTGAACAGCT GGATTCTGAT GGCACCATTA CTATAGAGGA GCAGATTGTC 360

CTTGTGCTGA AAGCGAAAGT ACAATGTGAA CTCAACATCA CAGCTCAACT CCAGGAGGGA 420

GAAGGTAATT GTTTCCCTGA ATGGGATGGA CTCATTTGTT GGCCCAGAGG AACAGTGGGG 480 

AAAATATCGG CTGTTCCATG CCCTCCTTAT ATTTATGACT TCAACCATAA AGGAGTTGCT 540 

TTCCGACACT GTAACCCCAA TGGAACATGG GATTTTATGC ACAGCTTAAA TAAAACATGG 600 

GCCAATTATT CAGACTGCCT TCGCTTTCTG CAGCCAGATA TCAGCATAGG AAAGCAAGAA 660

TTCTTTGAAC GCCTCTATGT AATGTATACC GTTGGCTACT CCATCTCTTT TGGTTCCTTG 720

GCTGTGGCTA TTCTCATCAT TGGTTACTTC AGACGATTGC ATTGCACTAG GAACTATATC 780

CACATGCACT TATTTGTGTC TTTCATGCTG AGAGCTACAA GCATCTTTGT CAAAGACAGA 840 

GTAGTCCATG CTCACATAGG AGTAAAGGAG CTGGAGTCCC TAATAATGCA GGATGACCCA 900 

CAAAATTCCA TTGAGGCAAC TTCTGTGGAC AAATCACAAT ATATCGGGTG CAAGATTGCT 960

GTTGTGATGT TTATTTACTT CCTGGCTACA AATTATTATT GGATCCTGGT GGAAGGTCTC 1020

TACCTGCATA ATCTCATCTT TGTGGCTTTC TTTTCGGACA CCAAATACCT GTGGGGCTTC 1080 

ATCTTGATAG GCTGGGGGTT TCCAGCAGCA TTTGTTGCAG CATGGGCTGT GGCACGAGCA 1140 

ACTCTGGCTG ATGCGAGGTG CTGGGAACTT AGTGCTGGAG ACATCAAGTG GATTTATCAA 1200 

GCACCGATCT TAGCAGCTAT TGGGCTGAAT TTTATTCTGT TTCTGAATAC GGTTAGAGTT 1260

CTAGCTACCA AAATCTGGGA GACCAATGCA GTTGGGCATG ACACAAGGAA GCAATACAGG 1320

AAACTGGCCA AATCGACACT GGTCCTGGTC CTAGTCTTTG GAGTGCATTA CATCGTGTTC 1380 
GTATGCCTGC CTCACTCCTT CACTGGGCTC GGGTGGGAGA TCCGCATGCA CTGTGAGCTC 1440

TTCTTCAACT CCTTTCAGGG TTTCTTTGTG TCTATCATCT ACTGCTACTG CAATGGAGAG 1500 

GTTCAGGCAG AGGTGAAGAA GATGTGGAGT CGGTGGAATC TCTCCGTGGA CTGGAAAAGG 1560

ACACCGCCAT GTGGCAGCCG CAGATGCGGC TCAGTGCTCA CCACCGTGAC GCACAGCACC 1620 

AGCAGCCAGT CACAGGTGGC GGCCAGCACA CGCATGGTGC TTATCTCTGG CAAAGCTGCC 1680

AAGATCGCCA GCAGACAGCC TGACAGCCAC ATCACTTTAC CTGGCTATGT CTGGAGTAAC 1740

TCAGAGCAGG ACTGCCTGCC ACACTCTTTC CACGAGGAGA CCAAGGAAGA TAGTGGGAGG 1800 

CAGGGAGATG ATATTCTAAT GGAGAAGCCT TCCAGGCCTA TGGAATCTAA CCCAGACACT 1860
GAAGGATGCC AAGGAGAAAC TGAGGATGTT CTCTGA 

Seq ID NO: 661 Protein sequence 

I I I I I I

MLRSSLSTSI VLFLFSSFST INESISSRKR HRFLEQLDSD GTITIEEQIV LVLKAKVQCE 60 

LNITAQLQEG EGNCFPEWDG LICWPRGTVG KISAVPCPPY IYDFNHKGVA FRHCNPNGTW 120 

DFMHSLNKTW ANYSDCLRFL QPDISIGKQE FFERLYVMYT VGYSISFGSL AVAILIIGYF 180 

RRLHCTRNYI HMHLFVSFML RATSIFVKDR VVHAHIGVKE LESLIMQDDP QNSIEATSVD 240

KSQYIGCKIA VVMFIYFLAT NYYWILVEGL YLHNLIFVAF FSDTKYLWGF ILIGWGFPAA 300

FVAAWAVARA TLADARCWEL SAGDIKWIYQ APILAAIGLN FILFLNTVRV LATKIWETNA 360

VGHDTRKQYR KLAKSTLVLV LVFGVHYIVF VCLPHSFTGL GWEIRMHCEL FFNSFQGFFV 420

SIIYCYCNGE VQAEVKKMWS RWNLSVDWKR TPPCGSRRCG SVLTTVTHST SSQSQVAAST 480

RMVLISGKAA KIASRQPDSH ITLPGYVWSN SEQDCLPHSF HEETKEDSGR QGDDILMEKP 540






Seq ID NO: 662 DNA sequence

1 11 21 31 41 51 

I I I I I I 

GGCCGGTGGC CCGGGCCCGA CCACCCCAGC TGCGCGTCGT TACTGGCCAC AAGTTTGCTC 60 

TGGGCCAGCC AAGTTGGCAA CTTGGAAGCT TCTCCCGGGC TCTGGAGGAG GGTCCCTGCT 120 

TCTTCCTACA GCCGTTCCGG GCATGGCCGG GCTGGGGGCG TCGCTCCACG TCTGGGGTTG 180

GCTAATGCTC GGCAGCTGCC TCCTGGCCAG AGCCCAGCTG GATTCTGATG GCACCATTAC 240

TATAGAGGAG CAGATTGTCC TTGTGCTGAA AGCGAAAGTA CAATGTGAAC TCAACATCAC 300

AGCTCAACTC CAGGAGGGAG AAGGTAATTG TTTCCCTGAA TGGGATGGAC TCATTTGTTG 360 

GCCCAGAGGA ACAGTGGGGA AAATATCGGC TGTTCCATGC CCTCCTTATA TTTATGACTT 420 

CAACCATAAA GGAGTTGCTT TCCGACACTG TAACCCCAAT GGAACATGGG ATTTTATGCA 480

CAGCTTAAAT AAAACATGGG CCAATTATTC AGACTGCCTT CGCTTTCTGC AGCCAGATAT 540 

CAGCATAGGA AAGCAAGAAT TCTTTGAACG CCTCTATGTA ATGTATACCG TTGGCTACTC 600 

CATCTCTTTT GGTTCCTTGG CTGTGGCTAT TCTCATCATT GGTTACTTCA GACGATTGCA 660

TTGCACTAGG AACTATATCC ACATGCACTT ATTTGTGTCT TTCATGCTGA GAGCTACAAG 720

CATCTTTGTC AAAGACAGAG TAGTCCATGC TCACATAGGA GTAAAGGAGC TGGAGTCCCT 780

AATAATGCAG GATGACCCAC AAAATTCCAT TGAGGCAACT TCTGTGGACA AATCACAATA 840 

TATCGGGTGC AAGATTGCTG TTGTGATGTT TATTTACTTC CTGGCTACAA ATTATTATTG 900 

GATCCTGGTG GAAGGTCTCT ACCTGCATAA TCTCATCTTT GTGGCTTTCT TTTCGGACAC 960

CAAATACCTG TGGGGCTTCA TCTTGATAGG CTGGGGGTTT CCAGCAGCAT TTGTTGCAGC 1020 

ATGGGCTGTG GCACGAGCAA CTCTGGCTGA TGCGAGGTGC TGGGAACTTA GTGCTGGAGA 1080

CATCAAGTGG ATTTATCAAG CACCGATCTT AGCAGCTATT GGGCTGAATT TTATTCTGTT 1140 

TCTGAATACG GTTAGAGTTC TAGCTACCAA AATCTGGGAG ACCAATGCAG TTGGGCATGA 1200 

CACAAGGAAG CAATACAGGA AACTGGCCAA ATCGACACTG GTCCTGGTCC TAGTCTTTGG 1260

AGTGCATTAC ATCGTGTTCG TATGCCTGCC TCACTCCTTC ACTGGGCTCG GGTGGGAGAT 1320 

CCGCATGCAC TGTGAGCTCT TCTTCAACTC CTTTCAGGGT TTCTTTGTGT CTATCATCTA 1380

CTGCTACTGC AATGGAGAGG TTCAGGCAGA GGTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CTCCGTGGAC TGGAAAAGGA CACCGCCATG TGGCAGCCGC AGATGCGGCT CAGTGCTCAC 1500 

CACCGTGACG CACAGCACCA GCAGCCAGTC ACAGGTGGCG GCCAGCACAC GCATGGTGCT 1560 

TATCTCTGGC AAAGCTGCCA AGATCGCCAG CAGACAGCCT GACAGCCACA TCACTTTACC 1620

TGGCTATGTC TGGAGTAACT CAGAGCAGGA CTGCCTGCCA CACTCTTTCC ACGAGGAGAC 1680

CAAGGAAGAT AGTGGGAGGC AGGGAGATGA TATTCTAATG GAGAAGCCTT CCAGGCCTAT 1740 

GGAATCTAAC CCAGACACTG AAGGATGCCA AGGAGAAACT GAGGATGTTC TCTGAATGGA 1800 

CATTTGTGGC TGACTTTCAT GGGCTGGTCC AATGGCTGGT TGTGTGAGAG GGCTTGGCTG 1860 

ATACTCCTAT GCTTGAGTTC AAAGGCTGAA AATTCAGTTA AGGTGTTACT TAATAATAGT 1920 

TTTTAGGCTC CATGAATTGG CTCCTGTAAA TACTAACGAC ATGAAAATGC AAGTGTCAAT 1980

GGAGTAGTTT ATTACCTTCT ATTGGCATCA AGTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCTCTGTGAT TGTTCATTTT TTTCTGCTAC TTTTGGGTAG AAAAAAGATT CAATTGCTTG 2100 

GCTGTAGCTT TCTCTCATAT ATATCACCCT AAATATAATG AAGATCTTTT AGTGTGTATC 2160 

ATTTTCCTTT TAGAAACTAG TATTCTCTTA TTTCTTACTT TAATGTACTT CTATCACTGC 2220 

ATTTATTTTG CCTGTGCATA GGAGCAATTA GGATCTAAAA AAATATATGG GAAGATAAAA 2280

GATCTAAGAA CAAGTACTTG CTGGAAAATT AGTTGGCTGG ACATTGATAA AATAATGCAT 2340

TTATAACAAT TACATGTGTT TTTGGGAACA AGGAAAATTT CTCAAAAAAG AATATTTCAC 2400 

ACATCCCTTC TTTTGAATGG CCTCTTTGTG ACCAGCCAGA CCTCAGGTCT TCACTCTTTC 2460 

TTCTTTGTAA ACCATGTCAT GTGGAAAGAT TTCCTCAGTT AGTGAGCTTG TGTCTGCAAA 2520 

TTGATTTTGT TTGTAATGTA TTTTGATAGC AAATCATGCT GCATCTATAT CTTTTTCTTG 2580

T GTGGGATCAA TTA^AAATTT GTTTTAAAAA 2 640 



MAGLGASLHV WGWLMLGSCL LARAQLDSDG TITIEEQIVL VLKAKVQCEL NITAQLQEGE
GNCFPEWDGL ICWPRGTVGK ISAVPCPPYI YDFNHKGVAF RHCNPNGTWD FMHSLNKTWA
NYSDCLRFLQ PDISIGKQEF FERLYVMYTV GYSISFGSLA VAILIIGYFR RLHCTRNYIH
MHLFVSFMLR ATSIFVKDRV VHAHIGVKEL ESLIMQDDPQ NSIEATSVDK SQYIGCKIAV
VMFIYFLATN YYWILVEGLY LHNLIFVAFF SDTKYLWGFI LIGWGFPAAF VAAWAVARAT 300

LADARCWELS AGDIKWIYQA PILAAIGLNF ILFLNTVRVL ATKIWETNAV GHDTRKQYRK 360 

LAKSTLVLVL VFGVHYIVFV CLPHSFTGLG WEIRMHCELF FNSFQGFFVS IIYCYCNGEV 420 

QAEVKKMWSR WNLSVDWKRT PPCGSRRCGS VLTTVTHSTS SQSQVAASTR MVLISGKAAK 480

IASRQPDSHI TLPGYVWSNS EQDCLPHSFH EETKEDSGRQ GDDILMEKPS RPMESNPDTE 540
GCQGETEDVL 



I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA G 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



1 11 21 31 41 51 

I I I I I I

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VIAAVIKNRK

FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL

LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA

PIYSRSYLVF WTVSNLMAFL IMVVVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT

VMTVLGAFVV CWTPGLVVLL LDGLNCRQCG VQHVKRWFLL LALLNSVVNP I

YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS

Seq ID NO: 666 DNA sequence



I I I I I I 

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180 

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 
TCCATGACGT GGCCCCTGAG
ACATCAAGCA CACGGAGGCC 
AGGGCCCTGG CAGCCCTCCC 
CCGCTGTGGC CTACATCATT 
AAGCCAAGCG GCTGCAGAAG 
GAGGGCCTTT GCAGAACGGG 
GCTTGGGCTC CGGCCCCGCG 
TCCCACGGTC TAGCCTGCAG 
TCCTGGCAAA GGCTCAGGGC 
GCCTGCAGAC GAAGGATGAG 
GGAAGCTGAA CCACGCCAAC 
ACTACATGGT GCTGGAATAT 
AGAGCAAGGA TGAAAAATTG 
GCACCCAGGT AGCCCTGGGC 
TGGCTGCGCG TAACTGCCTG 
TCAGCAAGGA TGTGTACAAC 
GCTGGATGTC CCCCGAGGCC 



GACTCAGGCC GCTACACCTG CATTGCAGGC 
GCCTGTGCCG 
CATTGGGTTG 
CTACTGCAAG 



CCCTACAAGA 



CAGCCCGAGG 
CAGCCCTCAG 
GCCACCAACA 
CCCATCACCA 
TTGGAGGAGG 
CAGCAGCAGC 



TGATCCAGAC 
GCCTCATGTT 
GCGAGGAGCC 
CAGAGATCCA 
AACGCCACAG 
CGCTGGGGAA 
GAGTGGCAGA 
TGGACTTCCG 



AGAAGAAGTG 
CACAAGTGAT 
GAGTGAGTTT 
GACCCTGGTA 
iGGGAGTTG 



AACAGCTGCA 
GAGGAGTCGG 
TCGGTGGGTG 
AAGCGCTGCA 
TGCCTCAACG 
GCCTTGACCA 
AAGATGCACT 



CAGATGATGA 
GCTGCCCTTC 
GGCCCTCCTT 
GAGGAGGGAG 
CAGCATGATG 
TTGCTGAGGT 
GGCTGACTTG 
CTCTTCCTCT 

AGGCTTGGGA 



AGTACTGGCA 



AAGTCACAGC CCCTCAGCAC 
ATGGAGCACC 
GTCAGTGCCC 
AGTGAGTACT ACCACTTCCG 
ATCCTGGAGG 
GAAGTGTTTA 



GCAGTTCCTG 
CAAGCAGAAG 
CCGCTTTGTG 



CGGCTGATGC AGCGCTGCTG GGCCCTCAGC 
GCCAGCGCCC TGGGAGACAG CACCGTGGAC 
ATGGCCTGGG CAGGGGAGGA CATCTCTAGA 
CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT 



ACACAGCAAG 
CCCCACCCTT 
CTTTTGACAC 
TGCAGCGTGG 
MCCTTA 



GACCCAAACT 
AT CAGGG ACA 
GACCGGGTCC 
TGAGCTGGGT 
AGTCTCTTGC 
TGAGTCCTCC 
CTCTCCTTTC 
TATATAAACC 



GGGCGACTAG 
GTGTGGGTGC 
AACTCTGCCA 



GGCTTTGAGC 
CACAGGTAAC 
CTCATCTGCC 
TTCCTTAATA 
ACTTGGGGGT 
CTTGTGCACA 



CCCAATTTCT 
AACTTTGCCT 
TTCTCAAGTT 
CTAGACCAGG 
CTGACCCAGA 
GATGAAGGAG 



= CCACTGGTCC 
C CCACTCTGGG 
C CTCATCCTAA 
C GCCCTTTTTG 

3 CATGGGAGGT AGGGGTGGGC CCTGGAGATG 
T TTATTGTTGT 

TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 



2160 
2220 
2280 
2340 
2400 
2460 
GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760



AGGATTTCCA 
GTGGCCCTAT 
CATAAGGACT 
GCCCTGGGCC 
GTGCCGCTGC 
GATGTCTGGG 



CAGCCCGAGG 
CCCAAGGACC 
AGCAAGCCGT 



TCCTTTGGGA 



GGCCTTCAAC 
GGGGAGGGCT 
CTGGGCACAC 
ATTATAGAGG 
CCCACGTCTT 
TTTTCAGGAG 
TATATGTAAT 
AGGAGGGTGG 



I I 
MGAARGSPAR PRRLPLLSVL 
VHVYWLLDGA PVQDTERRFA 
IKWIEAGPW LKHPASEAEI 
KERNLTLRPA GPEIISGLYSC 



LLPLLGGTQT 
QGSSLSFAAV 
QPQTQVTLRC 



I 

- aivpikqpss q: 



I 



CIGQGQRGPP I ILEATLHLA 
VRLPTIIGRVY QKGIIELVLAN 
SQLEEGKPGY LDCLTQATPK 
CMSSTPAGSI EAQAP.VQVLE 
GSSLPEWVTD NAGTLIIFARV 
TTVYQGHTAL LQCEAQGDPK 



EIEDMPLFEP 
IAESDAGVYT 
PTWWYRNQM 
KLKFTPPPQP 
TRDDAGNYTC 
PLIQWKGKDR 



I TLGKSEFGEV 



PLSTKQKVAL CTQVALGMEH 
YHFRQAWVPL RWMSPEAILE 
AGKARLPQPE GCPSKLYRLM 



LSNNRFVHKD 
GDFSTKSDVW 
QRCWALSPKD 



DRLQDSGTFQ 
HIDGHPRPTY 
SSQNFTLSIA 
RPPHLRRATV 
RVFTAG3EER 
CHAAN1AGQR 
LISEDSRFEV 
QQCKEFDKEA 
IASNGPQGQI 
ILDPTKLGPR 
EGPGSPPPYK 
GGPLQNGQPS 
FLAKAQGLEE 
HYMVLEYVDL 
LAARNCLVSA 
AFGVLMWEVF 
RPSFSEIASA 



CVARDDVTGE 
QWFRDGTPL3 
DESFARWLA 
FANGSLLLTQ 
VTCLPPKGLP 
RQDVNITVAT 
FKNGTLRINS 
TVPC3ATGRE 
RAHVQLTVAV 
MHIFQNGSLV 
MIQ L V 
AEIQEEVALT 
GVAETLVLVK 
GDLKQFLRIS 
QRQVKVSALG 



L LRCEVEAPGP 



EARSANASFN 
DGQSNHTVSS 
PQDWVARYE 
VRPRNAGIYR 



LGDSTVDSKP 



VPSWLKKPOD 
VEVYDGTWYR 
KPTIKWERAD 
FITFKVEPER 
IHDVAPEDSG 
AAVAYI IAVL 
SLGSGPAATN 
SLQTKDEQQQ 
KSKDEKLKSQ 
LSKDVYNSEY 
ADDEVLADLQ 



ATGGGCTACC AGAGGCAGGA G 



GTTGTCAACT 
GGGTTTCCTT 
GTTTTATTGA 
AAAACTTTCG 
ATAGCAATGA 



ACAGTTACCT 
TCCCTCATCT 
TCACTGGGTC 



CGATTATAGG 
TGGGAATATT 
TAAAAGGAGG 
GCTTTCCAGG 
TAAGTTACAA 
TTGATCCTGA 
TTACTCTGCC 
CTACAGGTTT 
CACACATACC 



GCTTTTATTC 
GGCCCTCTCT 
GTATCTGCTC 
TATAATAGCT 
AAACGTGTTT 



TGGGTTTCAT 



GAGATTTAGA 
AGTCTGCTGC 
CTTATTCAAT 



TGACAGAGAA 



CTCTCTGTTC 
GGAGATACTT 
ATTGGTCGCC 



AACAACT CTG A 



CCTACCAGTC 
TTCAGTTTTT 
TGAGCAAAGT 
ACTTCATTAT 
TAGCAAAGCT 
TTGTAATGGC 
TGCAAA 



TACAGTTCTC 
GTGATTTCTG 
TTCACCCAAG 



GAGGTAATTG 
ACAGTGATGG 
GTTCTAGAAC 
TGTTATCTGA 
ATGCTTCCCA 



TAGAAGAACC 
TATTTATCTG 
GGGACTTATT 
ATGGTGTCAC 
CCAATGTGTT 
TCATCACTGT 
TCAATGGTGT 
AACTGTCTGA 
TTGGTGCTGT 



CACAGTAGCT 
TATATTCTTT 
TGAAAATTAC 
TGTCATTTTG 



AGCCACGCTT 
GCTCTGTGCA 
AGAACCAAGG 
GGTGATGGTT 



AAGTGGTCCC 
G CT ACATGTG 
TGCAGAAATG 
ACATACCCTA 
AATCTTTCAT 
GTGTCATTGC 
ACTCCCCTCA 
ACACACTCCG 



ACCATAACTC 
GCCTTATCCA 
GATACTTGAC 
ATGACCTGGT 
TGGAATGCTT 



TGATTGATTG 



CTTTTCCCTT 
TTTGGTCAAT 
GTATCCTTTT 
TTTTCAAAGA 
TGGACTTTCC 
TGGAAAGGTC 
AAGGGCAATT 
GCCCAATGCC 
CTTCTTAGTT 
TATC-TCCATC 
ATTTACTGGC 
AACATTTGGA 
TGTGACAAGA 
CATTGTTGTA 
CCTCGGGATA 
TCCATCAGCC 
GTCTTGTGTC 
CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 

Seq ID NO: 669 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN WNSIICSCI IGLPYSMKQA 60

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120 

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180 

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240 

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420

QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ 

Seq ID NO: 670 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1284 

1 11 21 31 

« I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCC

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080 

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260

ATTAGTATCT TTCAACTCGA GTAA 




1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 
LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 

GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK
PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 
FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 
IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM
SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 

ISIFQLE



Seq ID NO: 672 DNA sequence



1 11 21 31 41 51 

I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT T" 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300 

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600 

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900 

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200

Seq ID NO: 673 Protein sequence
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60 
SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120
TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180
EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240
GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300
NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE



CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 
GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTT 



Seq ID NO: 675 Protein sequence 

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN

VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL

SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES
HVQQTTQLST LNISIFQLE 



Seq ID NO: 676 DNA sequence



I I I I I I 

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240 

CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300 

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600 

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780 

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840 

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG 



Seq ID NO: 677 Protein sequence
Protein Accession #: NP_006844.1



MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA
AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV
DWIQETMKNN



I I I I I I

ATGTGCAGCA ATGGACGGTG CATCCCGGGC GCCTGGCAGT GTGACGGGCT GCCTGACTGC 
TTCGACAAGA GTGATGAGAA GGAGTGCCCC AAGGCTAAGT CGAAATGTGG CCCGACCTTC
TTCCCCTGTG CCAGCGGCAT CCATTGCATC ATTGGTCGCT TCCGGTGCAA TGGGTTTGAG
GACTGTCCCG ATGGCAGCGA TGAAGAGAAC TGCACAGCAA ACCCTCTCCT TTGCTCCACC
GCCCGCTACC ACTGCAAGAA CGGCCTCTGT ATTGACAAGA GCTTCATCTG CGATGGACAG 
AATAACTGTC AAGACAACAG TGATGAGGAA AGCTGTGAAA GTTCTCAAGA ACCCGGCAGT 
GGGCAGGTGT TTGTGACTTC AGAGAACCAA CTTGTGTATT ACCCCAGCAT CACCTATGCC 
ATCATCGGCA GCTCCGTCAT TTTTGTGCTG GTGGTGGCCC TGCTGGCACT GGTCTTGCAC 
CACCAGCGGA AGCGGAACAA CCTCATGACG CTGCCCGTGC ACCGGCTGCA GCACCCTGTG 
CTGCTGTCCC GCCTGGTGGT CCTGGACCAC CCCCACCACT GCAACGTCAC CTACAACGTC 
AATAATGGCA TCCAGTATGT GGCCAGCCAG GCGGAGCAGA ATGCGTCGGA AGTAGGCTCC 
CCACCCTCCT ACTCCGAGGC CTTGCTGGAC CAGAGGCCTG CGTGGTATGA CCTTCCTCCA 
CCGCCCTACT CTTCTGACAC GGAATCICTG AACCAAGCCG ACCTGCCCCC CTACCGCTCC 
CGGTCCGGGA GTGCCAACAG TGCCAGCTCC CAGGCAGCCA GCAGCCTCCT CAGCGTGGAA 
GACACCAGCC ACAGCCCGGG GCAGCCTGGC CCCCAGGAGG GCACTGCTGA GCCCAGGGAC 
TCTGAGCCGA GCCAGGGCAC TGAAGAAGTA TAA 

Seq ID NO: 679 Protein sequence



1 11 21 31 41 51

MCSNGRCIPG AWQCDGLPDC FDKSDEKECP KAKSKCGPTF FPCASGIHCI IGRFRCNGFE 60

DCPDGSDEEN CTANPLLCST ARYHCKNGLC IDKSFICDGQ NNCQDNSDEE SCESSQEPGS 120

GQVFVTSENQ LVYYPSITYA IIGSSVIFVL VVALLALVLH HQRKRNNLMT LPVHRLQHPV 180

LLSRLVVLDH PHHCNVTYNV NNGIQYVASQ AEQNASEVGS PPSYSEALLD QRPAWYDLPP 240

PPYSSDTESL NQADLPPYRS RSGSANSASS QAASSLLSVE DTSHSPGQPG PQEGTAEPRD 300
SEPSQGTEEV


Seq ID NO: 680 DNA sequence 
Nucleic Acid Accession #: S78203.1
Coding sequence: 1..2190

1 11 21 31 41 51

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATTGAA 60

GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120 

AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180 

TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240

ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300

GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360

TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420

GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540

ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600

ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660

TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720

ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780

TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840

CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900 

AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960 

TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020

CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140

GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200

ATAAATGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260

CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320 

GAGTCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380 

AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC ^GAGCATTCT 1440

GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500

ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560

AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620

GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680

TGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740

TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 

ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860 

GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920

ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980

CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040

CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100 

ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160 
AAACTAGAGA CCAAGAAGAC AAAACTCTGA

MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60

YGMKAVLILY FLYFLHWNED TSTSIYHAFS SLCYFTPILG AAIADSWLGK FKTIIYLSLV 120

YVLGHVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180

TRYFSVFYLS INAGSLISTF ITPMLRGDVQ CFGEDCYALA FGVPGLLMVI ALVVFAMGSK 240

IYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300

RVLFLYIPLP MFWALLDQQG SRWTLQAIRM NRNLGFFVLQ PDQMQVLNPF LVLIFIPLFD 360

FVIYRLVSKC GINFSSLRKM AVGMILACLA FAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420

LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480

VQEKNWYSLV IREDGNSISS MMVKDTESKT TNGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540

EDYGVSAYRT VQRGEYPAVH CRTEDKNFSL NLGLLDFGAA YLFVITNNTN QGLQAWKIED 600

IPANKMSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGNIIV 660

LVVAQFSGLV QWAEFILFSC LLLVICLIFS IMGYYYVPVK TEDMRGPADK HIPHIQGNMI 720



1 11 21 31 41 51 

TCGCTTTGTG ATTCTTGATC CGGAACTTTG TCACCCAGGA ACCCCGGAAG AGGTAGCTCA 
CGCGATAGAA ACGTGTTCGC TTGCCCAGAA GAAGGGAAGG CGCGAGTGAG GAAAGGAGGT 
ACTGTAGATG CCCTCCAAAT CCTTGGTTAT GGAATATTTG GCTCATCCCA GTACACTCGG 
CTTGGCTGTT GGAGTTGCTT GTGGCATGTG CCTGGGCTGG AGCCTTCGAG TATGCTTTGG 
GATGCTCCCC AAAAGCAAGA CGAGCAAGAC ACACACAGAT ACTGAAAGTG AAGCAAGCAT 
CTTGGGAGAC AGCGGGGAGT ACAAGATGAT TCTTGTGGTT CGAAATGACT TAAAGATGGG 
AAAAGGGAAA GTGGCTGCCC AGTGCTCTCA TGCTGCTGTT TCAGCCTACA AGCAGATTCA 
AAGAAGAAAT CCTGAAATGC TCAAACAATG GGAATACTGT GGCCAGCCCA AGGTGGTGGT 
CAAAGCTCCT GATGAAGAAA CCCTGATTGC ATTATTGGCC CATGCAAAAA TGCTGGGACT 
GACTGTAAGT TTAATTCAAG ATGCTGGACG TACTCAGATT GCACCAGGCT CTCAAACTGT 
CCTAGGGATT GGGCCAGGAC CAGCAGACCT AATTGACAAA GTCACTGGTC ACCTAAAACT 
TTACTAGGTG GACTTTGATA TGACAACAAC CCCTCCATCA CAAGTGTTTG AAGCCTGTCA 
GATTCTAACA ACAAAAGCTG AATTTCTTCA CCCAACTTAA ATGTTCTTGA GATGAAAATA 
AAACCTATTC CCATGTTCTA AAAAAA 

Seq ID NO: 683 Protein sequence 



MPSKSLVMEY LAHPSTLGLA VGVACGMCLG WSLRVCFGML PKSKTSKTHT DTESEASILG
DSGEYKMILV VRNDLKMGKG KVAAQCSHAA VSAYKQIQRR NPEMLKQWEY CGQPKVVVKA
PDEETLIALL AHAKMLGLTV SLIQDAGRTQ IAPGSQTVLG IGPGPADLID KVTGHLKLY 



I I I I I 1 

CGGAACGAGG GCAACCTGCA CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC 60 

TCAGATGCTC CTGGTGTTGC TGGTGCTCTC GTGGCTGCCG CATGGGGGCG CCCTGTCTCT 120

GGCCGAGGCG AGCCGCGCAA GTTTCCCGGG ACCCTCAGAG TTGCACTCCG AAGACTCCAG 180

ATTCCGAGAG TTGCGGAAAC GCTACGAGGA CCTGCTAACC AGGCTGCGGG CCAACCAGAG 240 

CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCCGGATAC TCACGCCAGA 300

AGTGCGGCTG GGATCCGGCG GCCACCTGCA CCTGCGTATC TCTCGGGCCG CCCTTCCCGA 360

GGGGCTCCCC GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC CGACGGCGTC 420

AAGGTCGTGG GACGTGACAC GACCGCTGCG GCGTCAGCTC AGCCTTGCAA GACCCCAAGC 480

GCCCGCGCTG CACCTGCGAC TGTCGCCGCC GCCGTCGCAG TCGGACCAAC TGCTGGCAGA 540 

ATCTTCGTCC GCACGGCCCC AGCTGGAGTT GCACTTGCGG CCGCAAGCCG CCAGGGGGCG 600 

CCGCAGAGCG CGTGCGCGCA ACGGGGACGA CTGTCCGCTC GGGCCCGGGC GTTGCTGCCG 660 

TCTGCACACG GTCCGCGCGT CGCTGGAAGA CCTGGGCTGG GCCGATTGGG TGCTGTCGCC 720

ACGGGAGGTG CAAGTGACCA TGTGCATCGG CGCGTGCCCG AGCCAGTTCC GGGCGGCAAA 780

CATGCACGCG CAGATCAAGA CGAGCCTGCA CCGCCTGAAG CCCGACACGG AGCCAGCGCC 840 

CTGCTGCGTG CCCGCCAGCT ACAATCCCAT GGTGCTCATT CAAAAGACCG ACACCGGGGT 900 

GTCGCTCCAG ACCTATGATG ACTTGTTAGC CAAAGACTGC CACTGCATAT GAGCAGTCCT 960 

GGTCCTTCCA CTGTGCACCT GCGCGGGGGA GGCGACCTCA GTTGTCCTGC CCTGTGGAAT 1020 

GGGCTCAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA GCTGTATTTA TATAAGTCTG 1080

TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGGGGAC TCGGGGGCTG GTCTGATGGA 1140 

r TATTTAAAAC TCTGGTGATA AAAATAAAGC TGTCTGAACT GTTAAAAAAA 1200 



1 11 21 31 41 51

MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SLAEASRASF PGPSELHSED SRFRELRKRY
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH LHLRISRAAL PEGLPEASRL
HRALFRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQL 
ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTMC 
IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASYN PMVLIQKTDT GVSLQTYDDL 
LAKDCHCI 

Seq ID NO: 686 DNA sequence

ACCAAATCAA CCATAGGTCC AAGAACAATT GTCTCTGGAC GGCAGCTATG CGACTCACCG
TGCTGTGTGC TGTGTGCCTG CTGCCTGGCA GCCTGGCCCT GCCGCTGCCT CAGGAGGCGG 
3 TGAGCTACAG TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTTATCTCT 



TCTTTGGCCT ACCTATAACT GGAATGTTAA ACTCCCGCGT CATAGAAATA ATGCAGAAGC 300 
CCAGATGTGG AGTGCCAGAT GTTGCAGAAT ACTCACTATT TCCAAATAGC C 



TGGATCGATT AGTGTCAAAG GCTTTAAACA TGTGGGGCAA AGAGATCCCC CTGCATTTCA 480 

GGAAAGTTGT ATGGGGAACT GCTGACATCA TGATTGGCTT TGCGCGAGGA GCTCATGGGG 540 

ACTCCTACCC ATTTGATGGG CCAGGAAACA CGCTGGCTCA TGCCTTTGCG CCTGGGACAG 600

GTCTCGGAGG AGATGCTCAC TTCGATGAGG ATGAACGCTG GACGGATGGT AGCAGTCTAG 660

GGATTAACTT CCTGTATGCT GCAACTCATG AACTTGGCCA TTCTTTGGGT ATGGGACATT 720 

CCTCTGATCC TAATGCAGTG ATGTATCCAA CCTATGGAAA TGGAGATCCC CAAAATTTTA 780 

AACTTTCCCA GGATGATATT AAAGGCATTC AGAAACTATA TGGAAAGAGA AGTAATTCAA 840 

GAAAGAAATA GAAACTTCAG GCAGAACATC CATTCATTCA TTCATTGGAT TGTATATCAT 900 

TGTTGCACAA TCAGAATTGA TAAGCACTGT TCCTCCACTC CATTTAGCAA TTATGTCACC 960

CTTTTTTATT GCAGTTGGTT TTTGAATGTC TTTCACTCCT TTTATTGGTT AAACTCCTTT 1020

ATGGTGTGAC TGTGTCTTAT TCCATCTATG AGCTTTGTCA GTGCGCGTAG ATGTCAATAA 1080 
ATGTTACATA CACAAATAAA TAAAATGTTT ATTCCATGGT AAATTTA 

Seq ID NO: 687 Protein sequence
Protein Accession #: NP_0

1 11 21 31 41 51 

I I I I I I 

MRLTVLCAVC LLPGSLALPL PQEAGGMSEL QWEQAQDYLK RFYLYDSETK NANSLEAKLK 
EMQKFFGLPI TGMLNSRVIE IMQKPRCGVP DVAEYSLFPN SPKWTSKVVT YRIVSYTRDL
PHITVDRLVS KALNMWGKEI PLHFRKVVWG TADIMIGFAR GAHGDSYPFD GPGNTLAHAF
APGTGLGGDA HFDEDERWTD GSSLGINFLY AATHELGHSL GMGHSSDPNA VMYPTYGNGD
PQNFKLSQDD IKGIQKLYGK RSNSRKK

Seq ID NO: 688 DNA sequence 
Nucleic Acid Accession #: NM_005221.3
Coding sequence: 1..870 

1 11 21 31 41 51 

I I I I I I 

ATGACAGGAG TGTTTGACAG AAGGGTCCCC AGCATCCGAT CCGGCGACTT CCAAGCTCCG

TTCCAGACGT CCGCAGCTAT GCACCATCCG TCTCAGGAAT CGCCAACTTT GCCCGAGTCT
TCAGCTACCG ATTCTGACTA CTACAGCCCT ACGGGGGGAG CCCCGCACGG CTACTGCTCT
- ai :tcgg CTTCCTATGG CAAAGCTCTC AACCCCTACC AGTATCAGTA TCACGGCGTG
AACGGCTCCG CCGGGAGCTA CCCAGCCAAA GCTTATGCCG ACTATAGCTA CGCTAGCTCC
TACCACCAGT ACGGCGGCGC CTACAACCGC GTCCCAAGCG CCACCAACCA GCCAGAGAAA
GAAGTGACCG AGCCCGAGGT GAGAATGGTG AATGGCAAAC CAAAGAAAGT TCGTAAACCC

AGGACTATTT ATTCCAGCTT TCAGCTGGCC GCATTACAGA GAAGGTTTCA GAAGACTCAG
TACCTCGCCT TGCCGGAACG CGCCGAGCTG GCCGCCTCGC TGGGATTGAC ACAAACACAG
GTGAAAATCT GGTTTCAGAA CAAAAGATCC AAGATCAAGA AGATCATGAA AAACGGGGAG
ATGCCCCCGG AGCACAGTCC CAGCTCCAGC G 

CCAGCGGTGT GGGAGCCCCA GGGCTCGTCC CGCTCGCTCA GCCACCACCC T 
CCTCCGACCT CCAACCAGTC CCCAGCGTCC AGCTACCTGG AGAACTCTGC ATCCTGGTAC 
ACAAGTGCAG CCAGCTCAAT CAATTCCCAC CTGCCGCCGC CGGGCTCCTT ACAGCACCCG 
CTGGCGCTGG CCTCCGGGAC A 

Seq ID NO: 689 Protein sequence




PPTSNQSPAS SYLENSASWY TSAASSINSH LPPPGSLQHP LALASGTLY

It is understood that the examples described above in no way serve to limit the true
scope of this invention, but rather are presented for illustrative purposes. All publications,
sequences of accession numbers, and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

WHAT IS CLAIMED IS:

1. A method of detecting a lung cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1A-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1A-16.

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1A-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat lung cancer.

12. The method of claim 1, wherein the patient is suspected of having lung
cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated antibody in the biological
sample by contacting the biological sample with a polypeptide encoded by a polynucleotide
that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-16, wherein the polypeptide specifically binds to the lung cancer-associated
antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the lung cancer-associated antibody to a level of the lung cancer-associated
antibody in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1A-16, thereby
monitoring the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

21. The method of claim 19, wherein the patient is a human.

22. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1A-16.

23. The nucleic acid molecule of claim 22, which is labeled.

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22.

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having polynucleotide sequence as shown in Tables 1A-16.

28. An antibody that specifically binds a polypeptide of claim 27.

29. The antibody of claim 28, further conjugated to an effector component.

30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a lung cancer cell in a biological sample from a
patient, the method comprising contacting the biological sample with an antibody of claim
28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to lung cancer in a patient,
the method comprising contacting a biological sample from the patient with a polypeptide
encoded by a nucleic acid that comprises a sequence from Tables 1A-16.

38. A method for identifying a compound that modulates a lung cancer-associated
polypeptide, the method comprising the steps of:
(i) contacting the compound with a lung cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1A-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.

40. The method of claim 38, wherein the functional effect is a chemical
effect.

41. The method of claim 38, wherein the polypeptide is expressed in a
eukaryotic host cell or cell membrane.

42. The method of claim 38, wherein the functional effect is determined by
measuring ligand binding to the polypeptide.

43. The method of claim 38, wherein the polypeptide is recombinant.

44. A method of inhibiting proliferation of a lung cancer-associated cell to
treat lung cancer in a patient, the method comprising the step of administering to the subject a
therapeutically effective amount of a compound identified using the method of claim 38.

45. The method of claim 44, wherein the compound is an antibody.

46. The method of claim 45, wherein the patient is a human.

47. A drug screening assay comprising the steps of:
(i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom;
(ii) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 in a
treated cell or mammal with the level of gene expression of the polynucleotide in a control
cell or mammal, wherein a test compound that modulates the level of expression of the
polynucleotide is a candidate for the treatment of lung cancer.

48. The assay of claim 47, wherein the control is a mammal with lung
cancer or a cell therefrom that has not been treated with the test compound.

49. The assay of claim 47, wherein the control is a normal cell or mammal.

50. A method for treating a mammal having lung cancer comprising
administering a compound identified by the assay of claim 47.

51. A pharmaceutical composition for treating a mammal having lung
cancer, the composition comprising a compound identified by the assay of claim 47 and a
physiologically acceptable excipient.
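
As an illustration only, the "at least 80% identical" and "at least 95% identical" thresholds recited in the claims can be pictured with the minimal Python sketch below. It assumes the two sequences have already been aligned to equal length (gaps written as "-") and simply counts matching positions over the alignment length. This is a simplifying assumption, not the specification's definition of percent identity, and the function names percent_identity and meets_identity_threshold are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption (not from the specification): both inputs are already
    aligned to the same length, with '-' marking gap positions.
    """
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    # A position counts as identical only if both residues match and
    # neither is a gap character.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """Return True if the aligned sequences reach the given percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    reference = "ATGGGCTACC"  # first ten bases of Seq ID NO: 672 as listed above
    variant = "ATGGGCTAGG"    # hypothetical variant with two mismatches
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant, 95.0))
```

In practice percent identity is computed over an alignment produced by a sequence comparison algorithm, so a position-by-position count like this one is only meaningful once such an alignment has been made.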